Abstract

The purpose of this thesis was to: (1) identify some
promising least squares selection procedures discussed in
the literature, (2) introduce, implement, and study a
variable selection method proposed by Alan J. Miller, and (3)
extend Ross J. Hansen's 1988 thesis research by comparing
the methods he examined (Minimum MSE, Minimum Sp, and
Minimum Cp) with Miller's method.

To expedite a comparative analysis of Miller's method
and the other methods, Response Surface methodology was
employed with two performance measures. The first was the
percentage of correct variables in a model. The second, the
Theoretical Mean Squared Error of Prediction (TMSEP),
measured the predictive error between the model selected and
the theoretical model. Each technique was applied to
generated data with known multicollinearities, variances,
random predictors, and sample sizes. Both performance
measures were computed for models selected under each
technique. A full factorial design using each performance
measure was set up to study the effectiveness of each
variable selection technique with respect to the known data
characteristics. Equations were generated which related
these data characteristics to each combination of
performance measure and selection method. A graphical
analysis of variance was performed to summarize each
technique's performance.

Miller's method was shown to be the best overall
technique for selecting models with the highest percentage of
correct variables. Minimum MSE, followed closely by Minimum
Sp, selected models with the least TMSEP.

A COMPARISON OF VARIABLE SELECTION CRITERIA
FOR  MULTIPLE  LINEAR  REGRESSION:  A  SECOND  SIMULATION  STUDY 


I.  Introduction 


Background 

Linear regression is a statistical model-building tool
that uses data to construct a mathematical expression
capable of estimating the actual, but unknown, relationship
between a set of independent or predictor variables and
their corresponding response values. This mathematical
expression or model can, with a certain degree of accuracy,
predict the level of response of the associated phenomena,
given a set of predictor values. The methodologies,
processes and techniques employed to select which predictor
variables to include in a model form a sub-topic of linear
regression called subset selection. Unfortunately, it is
often difficult to determine the "best" set of predictor
variables to include in a linear regression model (Hansen,
1988:1). Alan J. Miller, an expert in the field of subset
selection, assesses the situation on the back cover of his
newest book, Subset Selection in Regression:

Most scientific computing packages contain
facilities for stepwise regression, and often for "all
subsets" and other techniques for finding
"best-fitting" subsets of regression variables.
The application of standard theory can be very
misleading in such cases when the model has not
been chosen a priori, but from the data. There is
widespread awareness that considerable overfitting
occurs, and that prediction equations obtained
after extensive "data dredging" often perform
poorly when applied to new data. (Miller,
1990:cover)

Clearly, as A.J. Miller points out, automated subset
selection processes are not foolproof. Over-fitting is
likely when one blindly applies an automated subset
selection method, such as Stepwise Regression, to data
containing both significant and insignificant (random)
predictors. The automated software selects predictors on the
basis of some preset criteria and will probably find the
"best fit" when a large number of these predictors are
included in the model, including any random ones. At first,
it may seem that predictors that are theoretically
independent of the response would not be selected because
they contribute nothing to the response. Freedman, however,
demonstrated that this is not necessarily the case. His
research indicates a good fit could result even when a model
is constructed from only random noise predictors (Freedman,
1983:153). Furthermore, when automatic algorithms compare
models containing only significant predictors and those
containing the same significant predictors augmented with
random predictors, one of the models containing randomness
is often selected. This occurs because the largest sample
correlation among the random predictors can become
significant and, in turn, cause the automated algorithm to
favor a model composed of both significant and random
predictors.

The problem of over-fitting emphasizes that when the
subset selection process is blindly turned over to automated
algorithms implemented by computer software packages, the
resulting mathematical equation may be useless. It may
model the data and the noise in the data very well while
failing to achieve the real goal of modeling the underlying
process or phenomenon. As a result, when one uses an
over-fitted model to predict future response levels, it
generally performs poorly because the presence of the random
predictors effectively masks whatever predictive insight the
significant predictors have to offer (Cafarella, 1979:14).
Sometimes human judgement, tempered by years of experience,
can recognize when over-fitting has occurred, can
discontinue the automatic algorithm, and can select a more
parsimonious model. Which criteria to use, however, in
selecting a more parsimonious model that will indeed
adequately represent the underlying process or phenomenon
may not be readily known. Obviously, research is needed to
determine which subset selection criteria perform best under
a given set of circumstances.

Problem Statement


This research effort collected and analyzed data on
certain subset selection techniques to better understand why
they perform as they do. The objectives of this research
were to: (1) identify some promising least squares selection
procedures discussed in the literature, (2) introduce,
implement, and study a variable selection method proposed by
Alan J. Miller, and (3) extend Ross Hansen's 1988 thesis
research by comparing the methods he examined (Minimum MSE,
Minimum Sp, and Minimum Cp) with Miller's method.

Assumptions 

Miller's technique requires certain assumptions prior
to its application. First, the collected data must be a
random sample from the population. Next, the error terms of
the least squares linear regression must be independent and
identically distributed, be from a normal distribution, and
have a mean of zero and constant variance. Finally, when
the Stepwise regression is run, only Forward Selection is
used, with a threshold F-value low enough to allow the
selection of at least one known random predictor.

Scope 

This study is an extension of Ross Hansen's research, in
which he examined three subset selection criteria (Minimum
MSE, Minimum Sp, and Minimum Cp) under varying amounts of
multicollinearity, variable variation, number of variables,
and sample size. Additionally, this study examined the
performance of a fourth subset selection criterion,
previously described and referred to as Miller's method,
under the same conditions. The performance of Miller's
method is compared to the performance of the three other
criteria Hansen studied. Contrasts and comparisons are made
and conclusions are drawn.


II.  Concept  Overview 


Least  Squares  Regression 

Assumptions. Certain key assumptions must be made
prior to constructing a least squares linear regression.
One must first assume the collected data represents the
population from which it came. That is, the data reflects
the normal case of the variable. Secondly, the error terms
are assumed to be independent and identically distributed,
from a normal distribution with a mean of zero and variance
$\sigma^2$.

Notation. The aim of linear regression is to calculate
what proportion of each independent variable should be added
or subtracted to best predict the dependent variable. In
general, the linear least squares regression equation is
written:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + e \qquad (1)$$

where:

$Y$ is the observed value of the dependent variable

$\beta_0$ is the constant term

$\beta_1, \beta_2, \ldots, \beta_k$ are constant proportional multipliers
of the independent variables $X_1, X_2, \ldots, X_k$

$k$ is the number of independent variables

$e$ is the error term.




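If there are n observations, or data points, the above
equation may be written for observation i as:

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_k X_{ik} + e_i, \quad i = 1, \ldots, n \qquad (2)$$

For convenience, the above equation can be written in matrix
notation:

$$Y = X\beta + e \qquad (3)$$

where Y is an n x 1 column vector:

$$Y = (Y_1, Y_2, \ldots, Y_n)^T \qquad (4)$$

and X is an n x (k+1) matrix:

$$X = \begin{bmatrix} 1 & X_{11} & \cdots & X_{1k} \\ 1 & X_{21} & \cdots & X_{2k} \\ \vdots & \vdots & & \vdots \\ 1 & X_{n1} & \cdots & X_{nk} \end{bmatrix} \qquad (5)$$

The first column contains all ones for the constant terms.
The remaining columns contain the $X_{ij}$ independent variables.
The X matrix is commonly referred to as the design matrix.

$\beta$ is a (k+1) x 1 column vector of regression coefficients:

$$\beta = (\beta_0, \beta_1, \ldots, \beta_k)^T \qquad (6)$$

and e is an n x 1 column vector of error terms:

$$e = (e_1, e_2, \ldots, e_n)^T \qquad (7)$$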

In least squares regression, each subset of regression
variables generates a surface which minimizes the squared
distance (error) between the observed values of the
dependent variable, $Y_i$, and the predicted values of the
dependent variable, $\hat{Y}_i$:

$$\min \sum_{i=1}^{n} e_i^2 = \min \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 \qquad (8)$$

The goal is to find the subset of variables which
minimizes the squared distances between the actual values
observed and the fitted surface. The sum of the
squared-error values is commonly referred to as the
sum-of-squares error (SSE). Graphically, a regression
resembles the following:


Figure 1. Two-dimensional Representation of Linear Least
Squares Regression


SSR is the sum of the squared distances from the mean
to the regression line, called Regression Sum of Squares.
SSE is the sum of the squared distances from the observed
points to the regression line, called Sum of Squares Error.
SST is called Sum of Squares Total and is calculated by:

$$SST = SSR + SSE = \sum_{i=1}^{n} (Y_i - \bar{Y})^2 \qquad (9)$$
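
The decomposition in equation (9) can be checked numerically. The following is a minimal sketch for the one-predictor case, using the closed-form least squares estimates; the function name and the data are hypothetical and are not taken from the thesis.

```python
def sums_of_squares(x, y):
    """Fit y = b0 + b1*x by least squares and return (SSR, SSE, SST)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Closed-form least squares estimates for a single predictor.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)           # mean to line
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # point to line
    sst = sum((yi - y_bar) ** 2 for yi in y)                # point to mean
    return ssr, sse, sst


# Hypothetical illustrative data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
ssr, sse, sst = sums_of_squares(x, y)
print(f"SSR={ssr:.4f}  SSE={sse:.4f}  SST={sst:.4f}")
```

For any least squares fit that includes a constant term, SSR and SSE computed this way should add up to SST, apart from floating-point rounding.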

Regression Analysis, as a branch of statistical
mathematics, began in the late 1800's when Sir Francis Galton
first attempted to use practical mathematical techniques to
investigate the dependence between two variables: the height
of the parents (he used the average of the parents' heights)
and the heights of their adult children. Having randomly
collected (sampled) many pairs of parent/child height
measurements (data), Galton observed that for a given
parent-height average, the conditional mean of the heights of
children with that given average parent height "regressed"
toward the mean height of all children. Thus, the term
regression analysis was born (Neter and others, 1990:26).
Regression techniques have since been developed that can
construct an equation or mathematical model based on
historical data and then use this model to predict future
responses (Neter and others, 1990:27).

Subset  Selection 

Subset selection is an area of regression analysis
concerned with choosing the "best" variables, or predictors,
to include in the regression model (Hocking, 1983:220). The
simple parent/child height model yielded only two choices:
one could accept the model or reject it. Accepting the
model meant that if one specified the parent-height, an
estimate of the adult child-height was automatically
generated. Consider, however, the complexity that occurs if
one not only possesses the heights of the parents but also
their right arm lengths. Then one would have to decide
whether to use a model to predict adult child-height based
on the height of the parents, or the right arm length of the
parents, or both, or neither. The methodologies of subset
selection can help suggest which predictors to use.
Unfortunately, and as previously addressed, applying these
methodologies without discretion has a tendency to produce
over-fitted models that have little predictive capability
(Miller, 1990:12-13).

In spite of these difficulties, however, subset
selection does play an important role in regression analysis.
While other areas of regression analysis detect and correct
problems in the data prior to model creation or verify the
adequacy of the model after creation, subset selection
techniques actually select the variables or predictors that
go into the model. These techniques are subdivided into two
major groupings:

(1) Least Squares regression techniques

(2) Biased regression techniques

For this literature review, only the Least Squares
regression techniques will be addressed. Selection
techniques for least squares have an advantage over biased
regression techniques in that the estimators are the best
linear unbiased estimators (BLUE).

All-subsets regression. All-subsets regression does
just that: it forms a regression model for each predictor or
combination of predictors. Miller claims that only by an
exhaustive search of all $2^k - 1$ combinations or subsets can
one be guaranteed to find the best-fitting model (Miller,
1984:391). Once generated, various criteria may be employed
in searching all $2^k - 1$ models for the one that best fits the
data. The all-subsets variable selection criteria addressed
in this literature review are:

(1) Near-Optimal-Model for Mean Square Absolute Errors
(MSAE),

(2) Mallows Cp,

(3) Coefficient of Determination or $R^2$,

(4) Maximum Adjusted $R^2$ or Minimum MSE,

(5) PRESSp (Prediction Sum of Squares) or Sp.
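
The size of the exhaustive search can be made concrete with a short sketch. It counts the $2^k - 1$ non-empty subsets for several values of k and enumerates them for a small case; the predictor names are hypothetical placeholders.

```python
from itertools import combinations


def count_subsets(k):
    """Number of non-empty subsets of k candidate predictors."""
    return 2 ** k - 1


def all_subsets(predictors):
    """Yield every non-empty subset of the predictor names, smallest first."""
    for size in range(1, len(predictors) + 1):
        yield from combinations(predictors, size)


for k in (1, 2, 3, 19, 20, 25):
    print(k, count_subsets(k))

# Full enumeration is only practical for a small number of predictors.
print(list(all_subsets(["X1", "X2", "X3"])))
```

Each additional predictor roughly doubles the count, which is why enumeration of this kind becomes impractical well before 25 predictors.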

Although exhaustive and guaranteed to find the "best"
model ("best" being defined by the criteria used), the
All-Subsets method has two drawbacks, regardless of the
criteria involved. First, it can only be used for a
moderately small number of predictors because the number of
possible subsets of predictors roughly doubles with each
additional predictor considered (e.g. 1 for 1 predictor, 3
for 2, 7 for 3, 524,287 for 19, 1,048,575 for 20, 33.5
million for 25, etc.) (Miller, 1990:56). Consequently, when
considering a realistic number of predictors (15 to 25) one
is forced to use a less exhaustive, but more efficient,
subset selection technique such as Stepwise regression.
Secondly, All-Subsets regression is only guaranteed to find
the "best" model if all significant predictors are
considered (Marula, 1983:160). If the group of predictors
under consideration does not contain all the significant
predictors, then the All-Subsets approach cannot find the
"best" overall model, but will produce the "best" model for
the predictors considered (Berk, 1978:3).

Mallows Cp. Mallows Cp is a statistic used to
determine the best model when the independent variables are
fixed. Cp is an approximation of the Mean Squared Error of
Prediction (MSEP).

$$C_p = \frac{SSE_p}{S^2} - (n - 2p) \qquad (10)$$

where

$SSE_p$ is the Sum of Squares Error of the subset model
$S^2$ is the estimate for the variance
p is the number of parameters
n is the number of data points

Theoretically, for a subset model with no bias, the value of
Cp is approximately p. Therefore, when Cp is approximately
equal to p, the model is good. Draper and Smith suggest
using this criterion in conjunction with stepwise regression
to obtain the best subset (Draper, 1981:341). It should be
noted, however, that as the variance estimate approaches
zero, the Cp statistic cannot be calculated. This method
therefore has limitations, especially when the fit is
perfect.

Barr pointed out a weakness of Mallows Cp. Since $S^2$ in
the Cp statistic is estimated from the original variable
pool, it could be biased and larger than the true variance
(Barr:5). If this is the case, the Cp statistic will be
deflated, causing the wrong model to be selected.
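
Both points can be illustrated with a short sketch. It assumes the standard form of the statistic, Cp = SSE_p / S^2 - (n - 2p), with p counting the constant term; the function name and all numbers are hypothetical.

```python
def mallows_cp(sse_p, s2, n, p):
    """Mallows Cp = SSE_p / s2 - (n - 2p).

    sse_p: error sum of squares of the subset model
    s2:    variance estimate, usually taken from the full variable pool
    n:     number of data points
    p:     number of parameters in the subset model (constant included)
    """
    if s2 <= 0:
        raise ValueError("Cp cannot be calculated when the variance estimate is zero")
    return sse_p / s2 - (n - 2 * p)


# Hypothetical case: n = 30 points, subset model with p = 4 parameters.
n, p, sse_p = 30, 4, 52.0
cp_unbiased = mallows_cp(sse_p, s2=2.0, n=n, p=p)   # 52/2 - 22, equal to p
cp_deflated = mallows_cp(sse_p, s2=2.6, n=n, p=p)   # larger s2 pulls Cp down
print(cp_unbiased, cp_deflated)
```

With the smaller variance estimate, Cp lands on p; inflating the estimate lowers Cp for the same subset, which is the deflation Barr describes.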

A limitation of Cp, as well as many other statistics,
is that it "depend[s] on the observed data only through
sufficient statistics, so they model average behavior of the
fit of a model to the data" (Weisberg, 1981:27). Weisberg
developed a procedure which allocates the Cp statistic to
individual cases. The advantage of Weisberg's procedure is
that, if the model under consideration is biased, it
provides a means to determine the bias of using a subset
model instead of the entire model (Weisberg, 1981:28).

Another application of the Cp statistic is to choose
the model which has the smallest Cp value (Judge, 1985:863).
By choosing the model with the minimum Cp, it is believed
that one is choosing the model with the minimum prediction
error. This is appealing, especially when it is difficult to
determine the optimal subset using the Cp-close-to-p
criterion. Since the Min Cp criterion is based on minimum
prediction error, it is based on a sound principle. However,
like the Cp-close-to-p criterion, Min Cp is derived under
the assumption that the independent variables are fixed.
Since this rarely happens in practice, there is some
question as to the usefulness of the Min Cp criterion.
Judge, Griffiths, Carter, Lutkepohl, and Lee recommend that
the Min Cp procedure not be used in any applied work (Judge,
1985:864).


Coefficient of Determination. The coefficient of
determination, $R^2$, is a statistic which gives an estimation
of the amount of variation about the mean which is explained
by the model.

$$R^2 = \frac{\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2} \qquad (11)$$

where

$\hat{Y}_i$ is the predicted value of $Y_i$
$Y_i$ is the actual (observed) value
$\bar{Y}$ is the mean of Y.


At first one might believe that it is desirable to find the
model which has the maximum $R^2$, since it explains the most
variation about the mean. However, this is not necessarily
the best. Certainly when we look at the $R^2$ value we would
like to see a large value, but it should not be used as the
only measure for subset selection. Maximum $R^2$ receives
little praise as far as its usefulness in determining a
good fit. The major pitfall of using $R^2$ is that adding a
variable never decreases it: $R^2$ will increase, or at best
stay the same, regardless of whether the variable has
anything to do with the dependent variable. According to
Healy, "In particular, the multiple correlation coefficient
is not really a regression-related concept at all. It is
basically defined to be the largest possible correlation
between the y-variate and any linear function of the x's and
this only makes sense when y and x's have a joint
probability distribution" (Healy, 1984:608). If maximum
$R^2$ is used as the selection criterion, the model
containing all variables will always be selected.
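
This pitfall is easy to reproduce. The sketch below fits a model with one real predictor, then adds a predictor of pure noise and compares the two $R^2$ values. The small normal-equations solver, the function names, and the simulated data are all assumptions made for illustration; a numerical library would normally do the fitting.

```python
import random


def fit_sse(columns, y):
    """SSE of the least squares fit of y on a constant plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    m = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(m)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(m)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(m):
        pivot = max(range(c, m), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(m):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    beta = [A[i][m] / A[i][i] for i in range(m)]
    return sum((y[r] - sum(b * v for b, v in zip(beta, X[r]))) ** 2
               for r in range(n))


def r_squared(columns, y):
    """R^2 = 1 - SSE/SST for a model with a constant term."""
    y_bar = sum(y) / len(y)
    sst = sum((v - y_bar) ** 2 for v in y)
    return 1.0 - fit_sse(columns, y) / sst


random.seed(0)
n = 25
x1 = [random.gauss(0, 1) for _ in range(n)]
noise = [random.gauss(0, 1) for _ in range(n)]   # unrelated to y by construction
y = [3.0 + 2.0 * a + random.gauss(0, 1) for a in x1]

r2_real = r_squared([x1], y)
r2_padded = r_squared([x1, noise], y)
print(f"R^2 with X1 only: {r2_real:.4f}; with X1 plus noise: {r2_padded:.4f}")
```

The second value should never be smaller than the first, even though the added column carries no information about the response.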

Maximum Adjusted $R^2$ or Minimum MSE. For simplicity,
only Maximum Adjusted $R^2$ will be discussed. However,
Maximum Adjusted $R^2$ and Minimum MSE test exactly the same
thing. Adjusted $R^2$ is related to $R^2$, but an adjustment has
been made for the degrees of freedom. The following equation
shows the relationship between $R^2$ and Adjusted $R^2$.

$$\text{Adjusted } R^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p} \qquad (14)$$

According to Draper and Smith, the Adjusted $R^2$ statistic
can be used not only to compare models for the same data set
(the same variable selection discussed in all other sections
of this literature review), but also to compare models taken
from two entirely different data sets (Draper, 1981:92).
However, they do not recommend using the Adjusted $R^2$
statistic in the latter role. The Adjusted $R^2$ statistic (or
the minimum MSE criterion) is still widely used in practice.
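
The claim that Maximum Adjusted $R^2$ and Minimum MSE test the same thing follows in a few steps. This is a sketch, assuming that p counts the parameters including the constant term, that $R^2 = 1 - SSE_p/SST$, and that $MSE_p = SSE_p/(n - p)$; the subscripted symbols are introduced here for illustration.

```latex
\begin{aligned}
\text{Adjusted } R^2
  &= 1 - (1 - R^2)\,\frac{n-1}{n-p}
   = 1 - \frac{SSE_p}{SST}\cdot\frac{n-1}{n-p} \\
  &= 1 - \frac{SSE_p/(n-p)}{SST/(n-1)}
   = 1 - \frac{MSE_p}{SST/(n-1)}
\end{aligned}
```

Because $SST/(n-1)$ depends only on the observed responses and is the same for every candidate subset, the subset with the largest Adjusted $R^2$ is the subset with the smallest MSE.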

PRESSp or Sp. The Sp criterion, originally proposed by
Hocking in 1976 (Hocking, 1976:20), has considerable appeal
and consequently has received praise in recent years. The
Sp statistic is an approximation of the MSEP based solely on
the data and number of variables. As is the case with MSEP,
the goal of this criterion is to find the minimum value.

$$S_p = \frac{SSE_p}{(n - p)(n - p - 2)} \qquad (15)$$


Breiman and Freedman point out that the Sp statistic
does not necessarily provide an accurate approximation of
MSEP, but works nonetheless (Breiman, 1983:132).

The  advantages  of  this  method  are  numerous.  Looking  at 
the  Sp  equation  gives  the  reader  an  idea  of  the  relative 
ease  with  which  Sp  is  calculated.  What  makes  Sp  even  more 
appealing  is  it  is  based  on  MSEP.  As  Thompson  points  out, 
"This  method  [Sp]  is  based  on  a  sound  criterion  —  that  of 
minimizing  the  expected  squared  distance  between  the  true 
and predicted values of the dependent variable, Y" (Thompson,
1978:6). Since Sp is an approximation of MSEP, it can
be used like MSEP to determine the optimal number of
regressors to include in the model (Breiman, 1983:132).

Sp is not without its disadvantages. It must be
calculated for all $2^k - 1$ possible subsets (Thompson,
1978:6). Even though it requires relatively little
computational effort, it does require that many regressions
be run. Through counterexamples, Breiman and Freedman show
that when the true variance due to prediction equals zero,
the Sp criterion fails to pick the optimal number of
variables to include in the model (Breiman, 1983:132).

Stepwise Regression. A more efficient technique,
called Stepwise regression, does not consider all the
possible combinations of predictors, but selects only the
significant predictors and brings them into the model one at
a time. Stepwise regression exists in three versions: 1)
Forward Selection, 2) Backward Elimination, or 3) a
combination of 1) and 2). Forward Selection starts with no
predictors in the model. It then adds significant
predictors to the model one at a time. At each iteration,
every predictor not yet in the model is tested for
significance with respect to the current model, and the most
significant one is added to the model. The process
continues until all predictors improving the fit of the
model are included in the model (Thompson, 1987:10). At no
point are variables ever taken out of the model. Backward
Elimination, the reverse of Forward Selection, starts with
every known predictor already in the model. At each
iteration, all the insignificant predictors are identified
with respect to the current model, and the least significant
predictor is eliminated. This process continues until tests
indicate that all insignificant predictors, with respect to
the current model, have been eliminated. At no point are
variables added back into the model (Thompson, 1987:10-11).
The combination of both techniques proceeds like Forward
Selection except that Backward Elimination is implemented at
each step. Each predictor is tested for significance with
respect to the current model, and the most significant
predictor is added to the model. Each time a new predictor
is brought in, every predictor in the new model is tested
with respect to the new model to make sure that it is still
significant after the addition of the newest predictor.
Predictors in the model are ranked by their significance and
the least significant predictor is eliminated if it is no
longer significant. This process continues until all
significant predictors are included in the model and all
insignificant predictors are eliminated (Thompson, 1987:11).

The overriding question, then, is how does one measure
significance among predictors? The most common measure of
significance, called the F-statistic, is a ratio that shows
how much explanatory power a predictor brings to the model
under consideration. To use an F-statistic in Forward
Selection stepwise regression, however, one must decide what
numerical threshold of the F-statistic is appropriate.
Selecting a small threshold F-value may inadvertently admit
random predictors into the model, while choosing a large
threshold F-value may cause significant predictors to be
omitted.

Miller's Method. Dr. Alan J. Miller suggests an
alternate subset selection method, one which he theorizes
could guard against bringing random predictors into the
model. He proposes augmenting the set of predictors with an
equal number of "dummy" predictors whose values are random
numbers. The method then applies Forward Selection stepwise
regression and proceeds, according to Miller, until the
first known random predictor is selected for inclusion in
the model. One then stops the Forward Selection stepwise
regression, discards the current model which includes
this known random predictor, and uses the previous model
(Miller, 1984:395). Any predictors not selected must,
therefore, have less significance than the random predictor
that Forward Selection attempted to select. Thus, all
predictors not selected should be discarded as insignificant
(Hocking, 1983:220). Just how well this subset selection
method performs on data plagued with collinearity and other
problems is one of the questions which inspired this
research effort.
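
The procedure can be sketched in a few dozen lines. This is an illustration under stated assumptions, not the implementation used in this research: dummy predictors are drawn from a standard normal distribution, no F threshold is applied (every step admits the best remaining candidate), and the names `fit_sse` and `millers_method` and the simulated data are hypothetical.

```python
import random


def fit_sse(columns, y):
    """SSE of the least squares fit of y on a constant plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    m = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(m)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(m)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(m):
        pivot = max(range(c, m), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(m):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    beta = [A[i][m] / A[i][i] for i in range(m)]
    return sum((y[r] - sum(b * v for b, v in zip(beta, X[r]))) ** 2
               for r in range(n))


def millers_method(real, y, seed=0):
    """Forward selection over real plus dummy predictors; stop at the first dummy.

    real: dict mapping predictor name -> list of values.
    Returns the real predictors selected before any dummy, in entry order.
    """
    rng = random.Random(seed)
    candidates = dict(real)
    dummies = set()
    for j in range(len(real)):                  # one dummy per real predictor
        name = f"dummy{j + 1}"
        candidates[name] = [rng.gauss(0, 1) for _ in range(len(y))]
        dummies.add(name)

    selected = []
    while len(selected) < len(candidates):
        remaining = [c for c in candidates if c not in selected]
        # The candidate with the smallest resulting SSE has the largest F-to-enter.
        best = min(remaining, key=lambda c: fit_sse(
            [candidates[s] for s in selected] + [candidates[c]], y))
        if best in dummies:
            break                               # discard this step, keep previous model
        selected.append(best)
    return selected


# Simulated data: X1 and X2 drive the response, X3 does not.
random.seed(1)
n = 40
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [random.gauss(0, 1) for _ in range(n)]
y = [1.0 + 5.0 * a - 4.0 * b + random.gauss(0, 0.5) for a, b in zip(x1, x2)]
selected_model = millers_method({"X1": x1, "X2": x2, "X3": x3}, y)
print(selected_model)
```

In this simulation the two strong predictors should enter first. Whether the irrelevant X3 also enters depends on whether it happens to beat every dummy at the next step, which is the kind of behavior the performance measures in this study are meant to quantify.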




IV.  Methodology  and  Model  Development 


Objective 

The goal of this thesis is to gain a better
understanding of the Minimum MSE, Minimum Sp, and Minimum Cp
variable selection criteria, as well as to introduce and
study a fourth selection criterion: Miller's method. The
four techniques will be compared.

Justification 

In this research effort, four variable selection
techniques were examined: Minimum MSE, Minimum Sp, Minimum
Cp, and Miller's method. These methods were chosen for the
following reasons:

(1) Ross Hansen's 1988 thesis research had already
studied and compared the Minimum MSE, Minimum Sp, and
Minimum Cp variable selection techniques. The methodology
and systematic approach he developed defined and guided this
research effort. However, due to recently discovered
computer errors in his data sets and analysis programs, many
of Hansen's original computations have been re-worked.

(2) Each of these techniques lends itself to computer
implementation, allowing the researcher to conduct
useful experiments and gain credible results with a
reasonable amount of computational effort. This is possible
because each of these techniques involves an absolute
criterion. In other words, all four methods can be executed
by a series of predetermined decisions. For the first three
methods, the MSE, Sp, and Cp statistics for each data set of
variables can be calculated by the SAS (SAS, 1985:956)
all-subsets regression procedure, R-Squared. The model
selected by the Minimum MSE, Minimum Sp, or Minimum Cp
methods is simply the one with the smallest value of MSE,
Sp, or Cp, respectively. Similarly, for Miller's method, a
model for each data set, augmented with the appropriate
number of random predictors, can be automatically selected
using the SAS Stepwise procedure with the forward selection
option. Miller's model is the largest subset of predictors
from the associated SAS model such that each predictor is
added in the order of significance determined by the
associated SAS model and no random predictors are admitted.
Upon encountering the first random predictor, the selection
process terminates and the model built up to that point,
excluding the random predictor, becomes the model for that
data set.

(3)  The  first  three  techniques  are  very  powerful,  as 
Hansen  points  out: 

The first three techniques appear in the last
decade's literature. The Minimum MSE procedure
used to be one of the most widely used methods.
Its appeal over techniques such as Max $R^2$ stems
from its adjustment for degrees of freedom. More
recently, Sp seems to have become the most popular
technique. Its appeal is based on the principle
of minimizing mean square errors of prediction
(MSEP). The Cp criterion is also based on MSEP,
and some authors praise this criterion. (Hansen,
1988:31)

(4) A formal study of the fourth technique, Miller's
method, has not been reported in statistical literature to
date. Comparing this virtually unknown subset selection
technique with the three well-understood techniques, the Minimum
MSE, Minimum Sp, and Minimum Cp methods, yielded valuable
insight into all four methods.

Limitations 

Since  this  thesis  extensively  employs  least  squares  re¬ 
gression,  the  results  and  conclusions  are  valid  only  if 
certain  assumptions  can  be  made  about  the  data.  As  outlined 
in  Chapter  2,  the  data  must  be  assumed  to  be  representative 
of  the  population.  Likewise,  the  error  terms  must  be  as¬ 
sumed  to  be  independent  and  identically  distributed  from  a 
normal  population  with  an  expected  value  of  zero  and  a 
constant variance σ². Finally, each predictor must be
assumed  related  to  the  response  (Hansen,  1988:32). 

Overview 

The  methodology  and  approach  exercised  in  this  thesis 
will  be  similar  in  content  to  that  used  by  Ross  Hansen  in 
his  1988  study.  Only  a  slight  expansion  in  methodology 
occurs  with  the  additional  implementation  of  Miller's  subset 
selection method. This research effort can be divided into
roughly  four  areas  of  focus: 

(1)  Data  generation. 

(2)  Model  selection. 

(3) Generation and analysis of a performance measure
for percentage of correct variables.

(4)  Generation  and  analysis  of  a  performance  measure 
for  method  accuracy. 

The  data  used  in  this  study  is  the  same  as  that  em¬ 
ployed  by  Hansen,  except  that  certain  computer  errors  have 
been  corrected.  The  data  sets  contain  various  known  and 
verifiable  statistical  properties. 

A  model  was  selected  from  each  data  set  using  each  of 
the  four  variable  selection  methods.  To  accomplish  this, 
preliminary  models  were  formed  using  SAS  all-possible  sub¬ 
sets  and  stepwise  regression  routines.  FORTRAN  routines 
then  performed  the  final  model  selection  process  for  each 
method on each data set.

Two  different  sets  of  performance  measures  were  calcu¬ 
lated.  The  first  set,  designated  PM,  was  used  to  evaluate 
what  effect  the  various  statistical  properties  of  the  data 
have  on  the  percentage  of  correct  variables  selected  in  a 
given  model.  Response  Surface  methodology  (RSM)  and  Box  and 
Whisker  plots  were  applied  to  determine  what  impact  specific 
statistical  properties  of  the  data  and  the  subset  selection 
technique used have on the percentage of correct variables
selected for a given group of models (Hansen, 1988:32). The
second set, designated TMSEP for Theoretical Mean
Squared Error of Prediction, is used to compare the accuracy
of  one  subset  selection  technique  to  another.  This  is 
accomplished  by  comparing  models  created  under  different 
selection  techniques  to  the  theoretical  model  from  which  the 
data  was  originally  generated.  Box  and  Whisker  plots  were 
also  generated  to  analyze  the  impact  of  each  factor  on  the 
accuracy  of  the  models  a  method  selects. 


Data  Generation 

Since  this  study  compares  its  results  with  the  results 
of  Hansen's  study,  part  of  the  data  used  came  directly  from 
Hansen's  study.  The  Hansen  data,  however,  was  augmented 
with  an  equal  number  of  random  predictors  to  accommodate 
Miller's  method. 

The  data  for  this  study  was  generated  from  the  follow¬ 
ing  equation: 


\[ Y_i = X_{1i} + X_{2i} + X_{3i} + X_{4i} + \epsilon_i \qquad (16) \]

where Y_i is the response variable,

X_{1i}, ..., X_{4i} are randomly generated predictors, and

\epsilon_i is a noise term to create variance in the model.

Most  simulation  studies  investigate  subset  selection 
techniques  with  all  significant  predictors  plus  some  unknown 
random variables included among the group of predictors from
which the model is created. This study attempted to find
what happens when one of the significant predictors is
deleted entirely from consideration. After the data is
created by equation (1), the X4 predictor is dropped from
consideration.  This  simulated  the  situation  which  arises 
when  a  significant  predictor  is  unknown  and  not  considered. 
Additionally,  either  one  or  three  noise  variables  were 
included  in  the  predictor  pool  to  simulate  data  collected  on 
predictors  thought  to  be  significant  but,  in  reality,  extra¬ 
neous  (Hansen,  1988:34-35).  Furthermore,  when  Miller's 
variable  selection  method  was  implemented,  the  predictor 
pool (consisting of both significant and extraneous predictors)
was doubled in size by the addition of an equal number
of known random predictors. The number of random predictors
added always equaled the number of variables already in the
predictor pool. In practice, however, the actual data sets
were not permanently expanded. SAS allowed each data set to
be temporarily expanded while running a stepwise analysis and
implementing  Miller's  method  on  each  data  set. 
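
The data construction described above can be sketched as follows. This is a hedged illustration rather than Hansen's generator: unit coefficients on the four true predictors, normally distributed predictors, a shared-component construction for the common correlation, and standard normal known random predictors are all assumptions, and the function names are hypothetical.

```python
import math
import random


def generate_data_set(rng, n, n_extraneous, rho, var_true, var_ext, var_err):
    """Return (y, pool): responses from X1..X4, pool holding X1..X3 plus extraneous."""
    y, pool = [], []
    for _ in range(n):
        shared = rng.gauss(0.0, 1.0)
        # Shared-component construction: each pair of true predictors has
        # correlation rho and each predictor has variance var_true.
        x = [
            math.sqrt(var_true)
            * (math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0))
            for _ in range(4)
        ]
        eps = rng.gauss(0.0, math.sqrt(var_err))
        y.append(sum(x) + eps)  # unit coefficients are an assumption
        extraneous = [rng.gauss(0.0, math.sqrt(var_ext)) for _ in range(n_extraneous)]
        pool.append(x[:3] + extraneous)  # X4 is dropped from consideration
    return y, pool


def augment_for_miller(rng, pool):
    """Append as many known random predictors as the pool already holds."""
    k = len(pool[0])
    return [row + [rng.gauss(0.0, 1.0) for _ in range(k)] for row in pool]


rng = random.Random(1)
y, pool = generate_data_set(rng, n=10, n_extraneous=1, rho=0.9,
                            var_true=100.0, var_ext=1.0, var_err=0.0625)
augmented = augment_for_miller(rng, pool)
print(len(pool[0]), len(augmented[0]))
```

With one extraneous variable the pool grows from 4 to 8 columns; with three it grows from 6 to 12, matching the counts given in the text.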

Factors. To understand how the various statistical
properties of the data affect each of the four techniques
studied,  six  potentially  significant  statistical  properties 
or  factors  were  chosen  a  priori  and  the  data  sets  were 
generated  based  on  these  six  factors.  RSM  was  used  to 
construct  an  equation  made  up  of  significant  factors  and 
factor  interactions  which  adequately  predicts  the  usefulness 


of  each  method  (Hansen,  1988:35-36).  The  six  factors  con¬ 
sidered  in  this  study  were: 

(1) The number of extraneous variables in the original
group of predictors. These variables model predictors which
are believed significant, but are actually random, extraneous
predictors (denoted by EX1, EX2, EX3). Because these
variables  are  noise,  they  are  theoretically  independent  of 
the  dependent  variable.  In  this  study,  at  the  low  setting 
the  number  of  extraneous  variables  is  1  and  at  the  high 
setting,  3. 

(2) The amount of correlation among the predictors
which are not extraneous, random variables (denoted by X1,
X2, X3, X4). At the low setting the variables are orthogonal,
or have zero correlation, while at the high setting they
are highly correlated with a correlation of 0.9.

(3)  The  variance  of  the  extraneous  predictors.  The 
low  setting  for  the  variance  is  1,  and  the  high  setting  is 
100. 

(4)  The  variance  of  the  significant  predictors. 

The  low  setting  for  the  variance  is  1,  and  the  high  setting 
is  100. 

(5)  The  sample  size.  The  low  setting  for  sample  size 
is  10,  while  the  high  setting  is  20.  The  low  setting  was 
set by Ross Hansen in his study of the Sp criterion; with any
smaller sample, Sp could not be calculated. Hansen's bounds on
sample size were adopted to facilitate method comparison
(Hansen, 1988:35-36).

(6)  The  variance  of  the  error  term.  The  low  setting 
for  the  variance  of  the  error  term  is  0.0625,  and  the  high 
setting  is  0.25. 

Data  Sets.  Sixty  data  sets  were  generated  for  each  of 
the  64  high/low  combinations  of  the  six  factor  settings.  In 
the  literature,  each  combination  of  factor  settings  is 
typically  referred  to  as  a  design  point  in  the  experiment. 

In  this  case,  the  experiment  was  to  determine  what  effect 
each  of  the  six  factors  has  on  PM.  Hansen  wrote  automated 
routines which created each group of the sixty data sets at
each of the sixty-four design points and put them in a file
related to the design point. For this thesis effort these
files were renamed 01.dat, 02.dat, ..., 64.dat. In all,
3840 data sets were generated. Appendix H contains FORTRAN
code  which  was  used  to  verify  that  the  data  sets  do  indeed 
possess appropriate statistical properties. A close examination
of Hansen's data revealed that he used the "natural
order"  for  generating  all-possible  combinations  of  the 
factor  settings.  To  accomplish  this,  he  first  established  a 
permanent  factor  order  for  future  reference. 


Table 1.

Factor Order for Data Generation

Order  Factor Description                   Low      High     Factor Symbol
1      # of extraneous predictors           1.0      3.0      A
2      Correlation among indep. predictors  0.0      0.9      B
3      Variance of ext. predictors          1.0      100.0    C
4      Variance of indep. predictors        1.0      100.0    D
5      Sample size                          10.0     20.0     E
6      Variance of the error term           0.0625   0.25     F

The factors were then varied according to the "natural
order". Factor A is varied most rapidly from its low to high
setting, followed by Factors B, C, D, E, and F. The following
table gives an example of factor combinations at several
design points. For the sake of brevity, "A" means factor A
at its high setting, "a" means factor A at its low setting,
and so forth.


Table 2.

Mapping of Design Points to Factor Settings

Design Point   Data File   Factor Settings
1              01.dat      a b c d e f
2              02.dat      A b c d e f
3              03.dat      a B c d e f
4              04.dat      A B c d e f
5              05.dat      a b C d e f
6              06.dat      A b C d e f
7              07.dat      a B C d e f
8              08.dat      A B C d e f
9              09.dat      a b c D e f
.              .           .
.              .           .
.              .           .
64             64.dat      A B C D E F

Generating  the  data  in  this  systematic  fashion  results  in  an 
equation  relating  the  performance  measure  for  each  subset 
selection  method  to  these  six  factors.  Before  the  perfor¬ 
mance  measures  can  be  generated,  however,  models  must  be 
selected  using  each  technique  on  each  data  set. 
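
The "natural order" enumeration of the 64 design points can be reproduced with a short sketch. The code below is illustrative only; the function name `design_points` is hypothetical, and the upper/lower-case labeling follows the convention used in Table 2.

```python
from itertools import product

FACTORS = "ABCDEF"


def design_points():
    """List factor-setting labels in natural order (factor A varies fastest)."""
    points = []
    # product() varies its LAST position fastest, so each tuple is reversed
    # to make the first factor, A, the fastest-changing one.
    for levels in product((0, 1), repeat=len(FACTORS)):
        highs = levels[::-1]
        label = " ".join(f if hi else f.lower() for f, hi in zip(FACTORS, highs))
        points.append(label)
    return points


points = design_points()
for number in (1, 2, 3, 9, 64):
    print(f"{number:02d}.dat  {points[number - 1]}")
```

Design point k (counting from 1) is simply the binary representation of k - 1, read with factor A as the least significant bit.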
Model Selection


Generally speaking, the variable selection process for
all four methods involved employing SAS routines to develop
a set of models and then filtering through those models with
FORTRAN programs, selecting a model by each method. The
implementation of this methodology was similar for the
Minimum MSE, Minimum Sp, and Minimum Cp variable selection
techniques, but differed for Miller's method. Appendices A
and B outline these techniques and reveal these
differences.

The reader should keep in mind that the best model at
each design point consists of only three predictors: X1, X2,
and X3, because X4 had been discarded after data generation.
Extraneous predictors, EX1, or EX1, EX2, and EX3, were
added to create the experiment. Although the experimenter
knew  these  were  extraneous  variables  and  that  they  should 
not  be  selected  for  inclusion  in  the  model,  the  three  sig¬ 
nificant  predictors  and  the  extraneous  predictor(s)  were 
presented  nevertheless  to  the  selection  process  as  legiti¬ 
mate  predictors. 

Minimum MSE, Minimum Sp, and Minimum Cp Methods. Separate
processing was performed for data sets possessing one
extraneous predictor and those with three extraneous predictors
(see Appendix A). The all-possible subsets SAS routine,
RSquared, was used to generate the models. Fifteen
models were generated for design points with 4 variables in
the pool and 63 models for design points with 6 variables in
the pool, and the MSE, Cp, and Sp statistics were calculated for
each model. The two different quantities of models are due
to the number of predictors (p) being considered; the number of
models is equal to 2^p - 1. FORTRAN programs then filtered through
the models for each data set and selected three models for each
data set: one with the smallest MSE statistic, one with the
smallest Cp statistic, and one with the smallest Sp statistic.
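
The counting and filtering steps above can be illustrated with a short sketch. The criterion formulas used here are the common textbook forms (MSE = RSS/(n - p'), Cp = RSS/s² - n + 2p', Sp = RSS/((n - p')(n - p' - 1)), with p' counting the intercept); whether they match the SAS output used in the thesis term for term is an assumption. The residual sums of squares are made-up numbers for illustration, and all function names are hypothetical.

```python
from itertools import combinations


def all_subsets(names):
    """Every non-empty subset of the predictor pool: 2**p - 1 of them."""
    return [c for r in range(1, len(names) + 1) for c in combinations(names, r)]


def criteria(rss, n, k, s2_full):
    """MSE, Cp and Sp for a k-predictor model fitted with an intercept."""
    p = k + 1  # number of parameters, counting the intercept
    return {
        "MSE": rss / (n - p),
        "Cp": rss / s2_full - n + 2 * p,
        "Sp": rss / ((n - p) * (n - p - 1)),
    }


def pick_minimum(stats_by_subset, criterion):
    """Return the subset whose value of the given criterion is smallest."""
    return min(stats_by_subset, key=lambda s: stats_by_subset[s][criterion])


print(len(all_subsets(["X1", "X2", "X3", "EX1"])))  # 4 predictors in the pool
print(len(all_subsets(["X1", "X2", "X3", "EX1", "EX2", "EX3"])))  # 6 predictors

# Hypothetical residual sums of squares for three candidate models, n = 10.
n = 10
rss = {
    ("X1", "X2"): 9.0,
    ("X1", "X2", "X3"): 1.2,
    ("X1", "X2", "X3", "EX1"): 1.1,
}
s2_full = rss[("X1", "X2", "X3", "EX1")] / (n - 5)  # full-model error variance
stats = {subset: criteria(r, n, len(subset), s2_full) for subset, r in rss.items()}
for name in ("MSE", "Cp", "Sp"):
    print(name, pick_minimum(stats, name))
```

The three criteria need not agree in general; in this made-up example they happen to select the same model.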

Miller's  Method.  Again,  it  was  necessary  to  handle  the 
processing  separately  for  data  sets  possessing  one  extrane¬ 
ous  predictor  and  those  with  three  extraneous  predictors 
(see  Appendix  B) .  To  employ  Miller's  method,  the  data  sets 
were  purposely  augmented  with  an  equal  number  of  known 
random predictors. Depending on whether the number of
extraneous variables in the data set is 1 or 3, either 4 or
6 random predictors, respectively, were added to the data
set. Miller's method effectively doubled the number of
predictors in the pool at each design point. The total
number of predictors under consideration by Miller's selection
process at each design point varied from 8 (3 unknown
true predictors, 1 unknown random predictor, 4 known random
predictors) to 12 (3 unknown true predictors, 3 unknown
random predictors, 6 known random predictors).

Once each data set was augmented, the SAS Stepwise routine using
Forward Selection processed it. One should note that the
F-to-enter threshold criterion was set to 1 to assure that at
least one of the known random predictors would be admitted
to  the  model.  Once  a  model  was  generated  for  each  data  set, 
FORTRAN  programs  were  used  to  generate  a  model  for  each  data 
set  via  Miller's  method. 

Performance Measure (PM) for the Percentage of Correct
Variables 

Justification. How one rates the performance of a
subset selection technique is a critical issue. Adopting a
reasonable, logical rating system eventually led to the
development of equations which related the success of a
method to the statistical properties of the data to which it
was applied. Hansen contends that there are no guaranteed
methods to screen out extraneous variables (random noise
terms which do not contribute at all to the model). Furthermore,
he contends that once extraneous variables are in the variable
pool, there is no criterion which guarantees that none of them
will be chosen for the model. Even the all-subset
procedure, which A.J. Miller contends performs quite well,
occasionally chooses extraneous variables (Hansen,
1988:32-33).

Since there really are no "guaranteed methods" for
capturing all the true variables, an excellent measure of
performance is to rate the success of a subset selection
method by the percentage of variables chosen correctly.
This index, referred to as PM, is calculated as follows:

\[ PM = \frac{\text{correct variables chosen}}{\text{number of variables chosen}} \]

This  study  used  PM  to  examine  the  relative  contribu¬ 
tions  of  the  six  factors  as  they  relate  to  the  performance 
of  each  of  the  four  subset  selection  methods  studied. 
Furthermore,  PM  is  a  logical  choice  for  two  reasons.  First, 
the  best  model  may  not  include  all  the  predictors  it  is 
generated  from,  but  only  the  most  significant.  Even  though 
a response value may have been generated from three predictors,
the best model may only contain two of those predictors.
Therefore, PM compensates by determining the percentage
of correct variables chosen. Second, PM takes into
account the number of extraneous variables chosen. It is
worse  to  select  a  model  with  only  two  predictors,  one  of 
which  is  extraneous,  than  it  is  to  select  a  model  containing 
five  predictors,  one  of  which  is  extraneous.  PM  adjusts 
accordingly  (Hansen,  1988:33). 

Calculation  of  PM.  At  each  of  the  64  design  points,  60 
models  were  generated.  FORTRAN  routines  examined  the  3840 
(64  times  60)  models  produced  and  selected  a  model  based  on 
the  criteria  for  each  method  studied.  In  this  final  stage 
of  model  selection  the  FORTRAN  programs  also  collected  the 
following  statistics  at  each  of  the  64  design  points: 

(1) The total number of predictors chosen in all
60 data sets.

(2) The total number of correct predictors chosen among
all 60 data sets (Hansen, 1988:39).

Using  these  statistics,  PM  was  calculated  for  each  method  at 
each  design  point.  With  PM  in  hand,  the  experiment  was  set 
up  and  the  relationships  determined  between  the  PM  and  the 
six  factors. 
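
As a small worked illustration of the PM calculation at one design point, the sketch below pools the two totals over the selected models and takes their ratio. The function name and the example models are hypothetical; the handling of the case where no predictors are chosen at all is an assumption, since the text does not address it.

```python
def pm_for_design_point(selected_models, true_predictors):
    """PM = total correct predictors chosen / total predictors chosen."""
    true_set = set(true_predictors)
    total = sum(len(model) for model in selected_models)
    correct = sum(len(true_set & set(model)) for model in selected_models)
    if total == 0:
        # Not covered by the text: PM is undefined when nothing is selected.
        raise ValueError("no predictors were selected at this design point")
    return correct / total


# Hypothetical models selected for three data sets at one design point.
models = [("X1", "X2", "X3"), ("X1", "X2", "EX1"), ("X1", "X3")]
print(pm_for_design_point(models, ["X1", "X2", "X3"]))

# The two cases compared in the text: one extraneous variable out of two
# chosen, versus one extraneous variable out of five chosen.
print(1 / 2, 4 / 5)
```

The last line shows why PM penalizes the small model more heavily: one extraneous variable among two gives 0.5, while one among five gives 0.8.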

Experiment. When using RSM it is convenient to work
with  coded  factors  (-1,1  variables)  for  the  following 
reasons: 

(1)  By  coding  the  factors,  the  resulting  predictors  are 
of  the  same  magnitude. 

(2)  Calculations  are  simplified. 

(3) The resulting design matrix, Z, is orthogonal.
Consequently, stepwise regression can be used to
find the significant factors with confidence
(4:36).

In  general,  translating  a  variable  from  uncoded  space 
to  coded  space  is  as  follows: 


\[ Z = \frac{X - \dfrac{HIGH + LOW}{2}}{\dfrac{HIGH - LOW}{2}} \qquad (18) \]

where X is the variable in uncoded space
Z is the variable in coded space
HIGH is the upper bound on the uncoded variable
LOW is the lower bound on the uncoded variable
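
A quick numerical check of the coding transformation, using the factor bounds from Table 1, is sketched below. The function name `code_factor` is hypothetical.

```python
def code_factor(x, low, high):
    """Map an uncoded value x in [low, high] onto the coded scale [-1, 1]."""
    center = (high + low) / 2.0
    half_range = (high - low) / 2.0
    return (x - center) / half_range


# Low and high settings of three of the factors in Table 1.
print(code_factor(1.0, 1.0, 3.0), code_factor(3.0, 1.0, 3.0))          # factor A
print(code_factor(10.0, 10.0, 20.0), code_factor(20.0, 10.0, 20.0))    # factor E
print(code_factor(0.0625, 0.0625, 0.25), code_factor(0.25, 0.0625, 0.25))  # factor F
```

Each low setting maps to -1 and each high setting to +1, so all six coded factors share the same scale regardless of their original units.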
Table 3.

Variable Coding for Response Surface Methodology

Variable              Uncoded    Non-Coded          Coded      Coded
Description           Variable   Low      High      Variable   Low   High
                      Name                          Name
Number of ext. vars.  A          1.0      3.0       Z1         -1    1
Correlation of
ind. vars.            B          0.0      0.9       Z2         -1    1
Variance of
ext. vars.            C          1.0      100       Z3         -1    1
Variance of
ind. vars.            D          1.0      100       Z4         -1    1
Sample Size           E          10.0     20.0      Z5         -1    1
Variance of
error term            F          0.0625   0.25      Z6         -1    1

It  seems  reasonable  to  assume  significance  of  individual 
factors  as  well  as  the  significance  of  interactions  between 
factors. To ensure that estimates for both these main
factors and their interactions can be accurately calculated,
a full 2^6 factorial design is necessary. To construct the
design matrix for a full factorial design, the coded factors
are varied from their low to high settings with the first
coded main factor being varied most rapidly, the second
varied next most rapidly, and so forth. The interaction
terms are simply the product of the corresponding coded main
factors. An example of this process using a full 2^3 factorial
design is summarized in the table below.
Table 4.

Example of Coding Interaction Variables

Coded setting

 Z1    Z2    Z3    Z1Z2   Z1Z3   Z2Z3   Z1Z2Z3
 -1    -1    -1      1      1      1      -1
  1    -1    -1     -1     -1      1       1
 -1     1    -1     -1      1     -1       1
  1     1    -1      1     -1     -1      -1
 -1    -1     1      1     -1     -1       1
  1    -1     1     -1      1     -1      -1
 -1     1     1     -1     -1      1      -1
  1     1     1      1      1      1       1

If a design with fewer than 2^6 runs is used, information on
some  of  the  high  order  interactions  would  be  unobtainable 
(Hansen,  1988:38). 
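
The construction of a coded design matrix with interaction columns, and the orthogonality that makes stepwise regression trustworthy here, can be checked with a brief sketch. The function names are hypothetical; the row order follows the rule stated above (first factor varied most rapidly), and `math.prod` requires Python 3.8 or later.

```python
from itertools import combinations, product
from math import prod


def full_factorial(k):
    """Coded main-factor rows for a 2**k design, first factor varying fastest."""
    return [levels[::-1] for levels in product((-1, 1), repeat=k)]


def with_interactions(rows):
    """Add every interaction column as the product of its main factors."""
    k = len(rows[0])
    terms = [c for r in range(1, k + 1) for c in combinations(range(k), r)]
    matrix = [[prod(row[i] for i in term) for term in terms] for row in rows]
    return terms, matrix


terms, Z = with_interactions(full_factorial(3))
for row in Z:
    print(row)

# Orthogonality: every pair of distinct columns has a zero dot product.
columns = list(zip(*Z))
orthogonal = all(
    sum(a * b for a, b in zip(columns[i], columns[j])) == 0
    for i, j in combinations(range(len(columns)), 2)
)
print(orthogonal)
```

For three factors the columns come out in the order Z1, Z2, Z3, Z1Z2, Z1Z3, Z2Z3, Z1Z2Z3; calling `full_factorial(6)` gives the 64 runs used in this study.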

Results.  The  significant  factors  that  contribute  to 
the  PM  were  selected  using  Stepwise  regression  with  Forward 
Selection  and  Backward  Elimination  (since  the  design  matrix 
for  this  experiment  was  orthogonal).  The  resulting  equa¬ 
tions  indicate  which  factor  or  factor  combinations  were  most 
significant  in  increasing  the  PM,  the  percentage  of  correct 
variables,  for  a  particular  method.  These  equations  are  not 
intended  to  predict  the  percentage  of  correct  variables, 
given  certain  factor  settings.  The  role  of  these  equations, 
however,  is  restricted  to  determining  which  factors  are 
significant  and  how  they  contribute  to  the  percentage  of 
correct  variables  in  a  model.  On  this  basis,  the  four 
subset selection methods can be compared. Three similar
equations were discovered by Hansen, one for each of the
three criteria he studied (Hansen, 1988:39). The analysis
which  follows  is  based  on  the  assumption  that  the  closer  a 
PM  value  is  to  one,  the  better  a  method's  performance. 

Factors  which  cause  PM  to  become  closer  to  one  are  desir¬ 
able. 

PM Equations. Using the statistical package
STATISTIX version 4.0, a 2^6 full factorial design matrix was
created and augmented with the PM vector (STATISTIX, 1992).

This  design  matrix  was  then  exported  to  the  SAS  system  where 
a  Stepwise  regression  procedure  was  run,  generating  the 
equations  below,  as  outlined  in  appendix  E  (SAS,  1985).  An 
equation  was  generated  for  each  method  studied  and  shows  how 
that  method's  performance  is  related  to  the  factors  under 
which  it  was  applied. 

Minimum MSE.

PM_MSE = 0.78 - 0.10(A) + 0.0023(D) + 0.006(E) + 0.0062(F)
         + 0.003(AB) + 0.003(AF) - 0.003(PF) - 0.006(FF)
         - 0.003(AFF)                                          (25)

Minimum Sp.

PM_Sp = 0.85 - 0.07(A) + 0.002(D) + 0.007(E) + 0.007(F)
        + 0.002(AP) + 0.006(AF) + 0.003(PF) + 0.007(AF)
        - 0.002(PF) - 0.005(FF) - 0.002(APF) - 0.006(AFP)      (26)

Minimum Cp.

PM_Cp = 0.84 - 0.07(A) + 0.003(D) + 0.007(E) + 0.008(F)
        + 0.002(Ai?) + 0.006(AB) + 0.003(DB) + 0.008(AF)
        - 0.003(BB) - 0.005(BB) - 0.002(ADB) - 0.006(ABB)      (27)

Miller's Method.

PM_Miller = 0.88 - 0.04(A) + 0.01(B) + 0.02(E)
            + 0.01(AB) + 0.008(AB) - 0.008(BB)
            - 0.008(BCEB)


Summary  of  Effects. 


Table 5.

Main Factor Coefficients of Effects by Method for PM

FACTOR            Minimum    Minimum    Minimum    Miller's
                  MSE        Sp         Cp         Method
A (ext. vars.)    - 0.10     - 0.07     - 0.07     - 0.04
B (ind. corr.)    --         --         --         + 0.01
C (ext. σ²)       --         --         --         --
D (ind. σ²)       + 0.0023   + 0.002    + 0.003    --
E (sam. size)     + 0.006    + 0.007    + 0.007    + 0.02
F (error σ²)      + 0.0062   + 0.007    + 0.008    --
Intercept         + 0.78     + 0.85     + 0.84     + 0.88


Table 6.

Main Factor Effects by Method and Rank Order of Significance
for PM

FACTOR            Minimum     Minimum     Minimum     Miller's
                  MSE         Sp          Cp          Method
A (ext. vars.)    1st         1st         1st         1st
B (ind. corr.)    No effect   No effect   No effect   2nd
C (ext. σ²)       No effect   No effect   No effect   No effect
D (ind. σ²)       4th         4th         4th         No effect
E (sam. size)     3rd         2nd         2nd         3rd
F (error σ²)      2nd         3rd         3rd         No effect


All  Four  Methods.  The  following  results 
pertain  to  all  methods: 

(1) The fewer the number of extraneous variables, the
better the performance.

(2) Larger sample sizes also yielded better performance.

(3) The variance of the extraneous variables had little
effect on the performance of any method.

Minimum MSE, Minimum Cp, and Minimum Sp Methods.
The following additional results were observed for these
methods:

(1) Higher variances on the independent variable yielded
better results.

(2)  Higher  variances  on  the  error  term  give  better 
results. 


(3) They were not affected by the correlation of the
independent  variables. 

Miller's Method. The following additional
results  were  observed  for  this  method: 

(1)  The  method  did  better  when  the  independent  vari¬ 
ables  are  highly  correlated. 

(2)  It  was  not  affected  by  the  fluctuating  variance  on 
any  term  (independent  or  extraneous  variables  or  the  error 
term) . 

Analysis. 

To  further  assess  the  impact  of  each  factor  (A,  B,  C, 

D,  E,  F)  on  PM  for  a  given  method,  STATISTIX  version  4.0  was 
used  to  produce  Box  and  Whisker  plots  by  indicator  grouping 
(STATISTIX,  1992:96).  Each  PM  value  was  associated  with  one 
of  eight  values  or  indicators,  dividing  it  into  eight  equal 
indicator  groupings.  To  assign  the  indicator  values,  an 
integer  "1"  through  "4"  was  assigned  to  PM  values  according 
to  the  method  it  measured:  1  for  minimum  MSE,  2  for  minimum 
Sp,  3  for  minimum  Cp,  and  4  for  Miller's  method.  Next,  each 
number  was  assigned  either  a  plus  or  minus  sign  depending  on 
the  factor  setting  of  the  factor  under  consideration,  plus 
for  high  values  and  minus  for  low  values.  A  set  of  indica¬ 
tor  values  was  developed  for  each  of  the  six  factors  stud¬ 
ied.  The  six  resulting  plots  reveal  much  about  the  useful- 
On the other hand, Miller's method was a top performer
at either setting. Miller's method, on average, outperformed
the other three methods, being least affected by the
number  of  extraneous  variables  present.  Since  in  practice, 
the  number  of  extraneous  variables  present  in  a  variable 
pool  is  not  known  (by  definition),  the  consistency  of  Mill¬ 
er's  method  in  dealing  with  an  unknown  number  of  extraneous 
variables  is  highly  desirable. 

When  only  one  extraneous  variable  is  present,  the 
performance  of  Minimum  Sp  and  Minimum  Cp  was  constant  and 
stable,  choosing  the  correct  variable  at  least  91  times  out 
of  100.  Under  these  circumstances,  where  few  extraneous 
variables  were  in  the  pool,  the  performances  of  Minimum  Sp 
and  Minimum  Cp  were  predictable  and  reliable,  though  not  as 
good  as  Miller's  method. 
Figure 3. Box and Whisker Plots Showing the Effect of
Factor  B  on  PM  by  Method 


Miller's  method  selects  the  highest  percentage  of 
correct  variables  at  either  level  and  is  the  only  method 
significantly affected by an increase in correlation among
the  truly  significant  predictors.  The  ability  of  Miller's 
method  to  select  correct  variables  actually  increases  as  the 
correlation  between  the  correct  variables  increases.  This 
occurs  because  the  increased  correlation  among  the  correct 
variables  causes  them  to  behave  as  one  variable.  If  any  one 
correct variable is selected, it is equivalent to all the
correct  variables  being  selected. 


Figure  4.  Box  and  Whisker  Plots  Showing  the  Effect  of 
Factor C on PM by Method


Factor  C,  the  variance  of  the  extraneous  variable,  had 
little  effect  on  any  of  the  four  methods.  The  median  of  the 
MSE  method  improved  slightly  with  an  increase  in  variance  of 
the  extraneous  variables  while  the  median  of  Miller's  method 
decreased  slightly. 

The Minimum Sp, Minimum Cp, and Minimum MSE methods lag
behind Miller's method and show greater variability.
Clearly the Minimum MSE method selects the smallest percentage
of correct variables.


Figure 5. Box and Whisker Plots Showing the Effect of
Factor D on PM by Method


Again, this plot for factor D, like that of factor C,
shows that the variance of the correct variables has little
effect on the percentage of correct variables chosen for any
of the four methods.
Figure 6. Box and Whisker Plots Showing the Effect of
Factor  E  on  PM  by  Method 

Increasing  the  sample  size,  factor  E,  increases  the 
median  performance  of  Miller's  method  by  5  percent.  The 
Minimum  MSE  method  also  improves  slightly  as  sample  size 
increases. The Minimum Sp and Minimum Cp methods are not
affected.


Figure 7. Box and Whisker Plots Showing the Effect of
Factor F on PM by Method


Increasing  the  variance  of  the  error  term,  factor  F, 
has  little  effect  on  any  of  the  four  methods.  The  median 
performances of the Minimum MSE method and Miller's method
are slightly increased at the higher factor level.


Theoretical Mean Square Error of Prediction (TMSEP) as a
Performance Measure of Model Accuracy


Justification.  Thus  far,  analysis  has  been  limited  to 
studying the effects which varying factors have on a method's
ability to select the correct variables. PM, however,
when employed as an index for comparing different selection
methods,  favored  techniques  which  select  models  with  the 
highest  percentage  of  correct  variables.  Although  models 
with  a  high  percentage  of  correct  predictors  are  desirable, 
methods  which  select  such  models  may  do  so  by  selecting  more 
variables  overall.  In  such  models,  the  ratio  of  extraneous 
variables  to  all  the  variables  may  be  small,  but  the  abso¬ 
lute  number  of  extraneous  variables  may  be  larger  than 
desired  simply  because  of  the  sheer  number  of  variables 
selected.  Comparing  techniques  on  the  basis  of  PM  may  favor 
methods  which  create  these  larger  models  rather  than  those 
which  create  parsimonious  models.  Therefore,  a  different, 
more  absolute  performance  measure  was  adopted  to  compare 
selection  techniques  in  terms  of  how  close  the  selected 
model's response value is to the true response value. A
comparison of how accurately each technique performs can be
accomplished using another performance measure known as
Theoretical Mean Square Error of Prediction (TMSEP)
and  defined  by: 


\[ TMSEP_{kM} = \frac{\sum_{i=1}^{n_k} \left( Y_{Ti} - \hat{Y}_{Mi} \right)^2}{n_k - p_M - 1} \qquad (29) \]

where

TMSEP_{kM} is the TMSEP for data set k using the
subset selection technique M

Y_T is the theoretical conditional mean of Y calculated
from the underlying data generation model
(1) and the data set k

\hat{Y}_M is the predicted value of Y using the model
selected by applying method M to data set k

n_k is the sample size of data set k

p_M is the number of predictors in the model selected
by applying method M to data set k

TMSEP is a good choice for an inter-technique comparison. It
compares each method's model at a particular data set to the
theoretical model which generated the original data. In theory,
TMSEP directly measures how well the predicted model explains the
variations in the original data. Furthermore, the TMSEP criterion
is a variation of Mean Squared Error of Prediction (MSEP), a
statistic that has received much praise in the literature. TMSEP
and MSEP both calculate the squared difference between the
predicted value of Y and the actual value of Y and adjust the
value for the degrees of freedom. TMSEP differs from MSEP,
however, in its calculation. TMSEP is calculated by squaring the
difference between the theoretical Y value (the response from the
underlying data generation equation, excluding the error term)
and the predicted Y value generated by the model constructed
using variable selection procedure M. The resulting value is the
Theoretical MSEP, or TMSEP. Since the TMSEP is based on MSEP,
which has received considerable praise in the literature during
the past decade, the TMSEP is also considered a most promising
criterion (Hansen, 1988:43-45).
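
The per-data-set calculation described above can be sketched in a
few lines of Python. This is only an illustration, not the FORTRAN
and SAS processing used in this research: the function name `tmsep`
and the sample values are hypothetical, and the sketch assumes the
degrees-of-freedom adjustment is the sample size minus the number
of predictors in the selected model.

```python
def tmsep(y_theoretical, y_predicted, n_predictors):
    """TMSEP for one data set (illustrative sketch).

    y_theoretical: responses from the data generation equation,
                   excluding the error term
    y_predicted:   responses predicted by the selected model
    n_predictors:  number of predictors in the selected model
    """
    n = len(y_theoretical)
    if len(y_predicted) != n:
        raise ValueError("both response vectors must have the same length")
    dof = n - n_predictors  # assumed degrees-of-freedom adjustment
    if dof <= 0:
        raise ValueError("need more observations than predictors")
    # Sum of squared differences between theoretical and predicted values.
    sse = sum((yt - yp) ** 2 for yt, yp in zip(y_theoretical, y_predicted))
    return sse / dof


# Hypothetical values for illustration only (not thesis data).
y_t = [1.0, 2.0, 3.0, 4.0, 5.0]
y_hat = [1.5, 2.0, 2.5, 4.0, 5.5]
print(tmsep(y_t, y_hat, n_predictors=2))
```

Because the theoretical response excludes the error term, a model
that reproduces the generating equation exactly would score zero
under this sketch regardless of the noise in the observed data.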

In defending the credibility of TMSEP, Hansen notes that at first
glance, TMSEP appears to unfairly favor the Minimum Cp and Minimum
Sp criteria because both are based on minimum MSEP. Furthermore,
one might falsely assume that since the Sp criterion and TMSEP are
based on the regressors being randomly generated, the TMSEP would
favor the Minimum Sp method. Hansen clearly shows this is not the
case.

It is assumed when calculating the Sp and Cp statistics
that all relevant variables are included in the variable
pool. It is also assumed that the variable pool does not
contain extraneous variables. In this study both of these
assumptions are violated. Therefore, it is possible that
either the MSE, Cp[, or Miller] criterion could outperform
the Sp criterion. (Hansen, 1988:46)

Calculating TMSEP. The equation presented thus far to calculate
TMSEP does so one data set at a time. Recall that the generated
data consists of 64 design points, each of which is made up of 60
data sets. In order to compare each of the four variable selection
techniques, the TMSEP must somehow be calculated for each
technique at each design point. Although generating the TMSEP for
each data set is a starting point, a slightly different TMSEP
equation is necessary to generate the aggregate TMSEP at each
design point.
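Starting with the original equation:

$$\mathrm{TMSEP}_{kM} = \frac{\sum_{i=1}^{n_k} \left( Y_{Ti} - \hat{Y}_{iM} \right)^2}{n_k - p_{kM}}$$

Then, applying algebra yields:

$$(n_k - p_{kM})\,\mathrm{TMSEP}_{kM} = \sum_{i=1}^{n_k} \left( Y_{Ti} - \hat{Y}_{iM} \right)^2$$

Hansen assumed a Chi Square distribution:

$$\sum_{i=1}^{n_k} \frac{\left( Y_{Ti} - \hat{Y}_{iM} \right)^2}{\sigma^2} \sim \chi^2_{(n_k - p_{kM})}$$

Then, it follows that:

$$\frac{(n_k - p_{kM})\,\mathrm{TMSEP}_{kM}}{\sigma^2} \sim \chi^2_{(n_k - p_{kM})}$$

Based on the theorem that the sum of independent $\chi^2$
variables is also $\chi^2$, we have:

$$\sum_{k=1}^{60} \frac{(n_k - p_{kM})\,\mathrm{TMSEP}_{kM}}{\sigma^2} = \sum_{k=1}^{60} \sum_{i=1}^{n_k} \frac{\left( Y_{Ti} - \hat{Y}_{iM} \right)^2}{\sigma^2} \sim \chi^2_{\left( \sum_{k=1}^{60} (n_k - p_{kM}) \right)}$$

Therefore, the formula for calculating TMSEP is:

$$\mathrm{TMSEP}_{DM} = \frac{\sum_{k=1}^{60} \sum_{i=1}^{n_k} \left( Y_{Ti} - \hat{Y}_{iM} \right)^2}{\sum_{k=1}^{60} (n_k - p_{kM})}$$

where $\mathrm{TMSEP}_{DM}$ is the TMSEP at design point D
using method M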
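
A minimal Python sketch of the aggregate calculation follows. It
assumes the design-point TMSEP pools the data sets by dividing the
summed squared errors by the summed degrees of freedom (sample
size minus number of selected predictors), which is one reading of
the derivation above. The function name `pooled_tmsep` and the
sample values are hypothetical and do not come from the thesis
programs.

```python
def pooled_tmsep(data_sets):
    """Aggregate TMSEP over the data sets at one design point.

    data_sets: iterable of (y_theoretical, y_predicted, n_predictors)
               tuples, one tuple per data set (60 per design point in
               this research).
    """
    total_sse = 0.0
    total_dof = 0
    for y_theoretical, y_predicted, n_predictors in data_sets:
        if len(y_theoretical) != len(y_predicted):
            raise ValueError("response vectors must have the same length")
        # Squared error against the theoretical (error-free) response.
        total_sse += sum(
            (yt - yp) ** 2 for yt, yp in zip(y_theoretical, y_predicted)
        )
        # Assumed degrees of freedom for this data set: n_k - p_kM.
        total_dof += len(y_theoretical) - n_predictors
    if total_dof <= 0:
        raise ValueError("pooled degrees of freedom must be positive")
    return total_sse / total_dof


# Two small hypothetical data sets, for illustration only.
example = [
    ([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0], 1),
    ([0.0, 0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0, 1.0], 2),
]
print(pooled_tmsep(example))
```

Pooling in this way weights each data set by its degrees of
freedom, so a data set whose selected model has many predictors
contributes less to the denominator than one with a small model.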

Minimum MSE, Minimum Sp, and Minimum Cp Methods.
Appendix  C  outlines  the  data  processing  to  calculate  TMSEP 
for  each  design  point  for  the  above  three  methods.  The 
processing  is  similar  to  that  performed  at  each  design  point 
during  subset  selection  for  each  method.  The  processing 
differs  in  that  for  each  model  selected,  the  coefficients  of 
regression  are  estimated  by  SAS.  The  FORTRAN  program  uses 
these  coefficients  and  the  original  data  to  generate  TMSEP 
for  each  design  point  and  each  method. 

Miller's Method. Appendix D outlines the data processing to
calculate TMSEP for each design point for Miller's method. A
FORTRAN program creates a SAS program file to calculate the
coefficients of regression for each model. This SAS program is
executed and the output is filtered and formatted by yet another
FORTRAN program. A third FORTRAN program processes this output
along with the data sets and calculates the TMSEP for each method
and design point.


Experiment.  An  experiment  identical  to  the  one  run  for 
PM  was  run  to  determine  the  significant  factors  for  TMSEP. 
Basically,  TMSEP  was  substituted  for  PM  in  the  experimental 
design  and  then  the  experiment  was  run  as  before,  using  the 
SAS  Stepwise  procedure. 

Results. The same comments that applied to PM apply to TMSEP,
with one notable exception. Whereas with PM, values approaching 1
from below were desirable, with TMSEP, values approaching 0 from
above are the target.

TMSEP Equations. These equations were generated in exactly the
same manner as the PM equations.

Minimum MSE.

$$\begin{aligned}
\mathrm{TMSEP}_{MSE} ={}& 22.15 - 1.76(A) - 16.91(B) + 21.6(D) + 3.04(E) \\
&+ 1.4(AB) - 1.75(AD) + 0.74(AE) - 16.57(BD) \\
&- 2.4(BE) + 3.03(DE) + 1.38(ABD) - 0.65(ABE) \\
&+ 0.74(ADE) - 2.36(BDE) - 0.64(ABDE)
\end{aligned} \qquad (40)$$

Minimum Sp.

$$\begin{aligned}
\mathrm{TMSEP}_{Sp} ={}& 23.22 - 1.54(A) - 17.76(B) + 22.65(D) + 2.29(E) \\
&+ 1.27(AB) - 1.52(AD) - 17.4(BD) - 1.8(BE) \\
&+ 2.3(DE) + 1.24(ABD) - 1.77(BDE)
\end{aligned} \qquad (41)$$

Minimum Cp.

$$\begin{aligned}
\mathrm{TMSEP}_{Cp} ={}& 37.51 - 0.84(A) - 25.72(B) + 36.69(D) + 5.86(E) \\
&+ 0.79(AB) - 0.82(AD) - 25.19(BD) - 4.08(BE) \\
&+ 5.77(DE) + 0.77(ABD) - 4.0(BDE)
\end{aligned}$$

Miller's Method.

$$\begin{aligned}
\mathrm{TMSEP}_{Millers} ={}& 33.02 - 26.22(B) + 32.26(D) - 3.93(E) \\
&- 25.69(BD) + 3.32(BE) - 2.\text{[illegible]}\,(DE) \\
&+ 2.21(BDE)
\end{aligned}$$

Summary of Effects.

Table 7.

Main Factor Coefficients of Effects by Method for TMSEP

METHOD →          Minimum    Minimum    Minimum    Miller's
FACTOR ↓          MSE        Sp         Cp         Method

A (ext. vars.)    - 1.76     - 1.54     - 0.84     --
B (corr.)         -16.91     -17.76     -25.72     -26.22
C (ext. σ²)       --         --         --         --
D (ind. σ²)       +21.6      +22.65     +36.69     +32.26
E (sam. size)     + 3.04     + 2.29     + 5.86     - 3.93
F (error σ²)      --         --         --         --
Intercept (μ)     +22.15     +23.22     +37.51     +33.02


Main Factor Effects by Method and Rank Order of Significance
for TMSEP

METHOD →          Minimum      Minimum      Minimum      Miller's
FACTOR ↓          MSE          Sp           Cp           Method

A (ext. vars.)    4th          4th          4th          No effect
B (corr.)         2nd          2nd          2nd          2nd
C (ext. σ²)       No effect    No effect    No effect    No effect
D (ind. σ²)       1st          1st          1st          1st
E (sam. size)     3rd          3rd          3rd          3rd
F (error σ²)      No effect    No effect    No effect    No effect

All Four Methods. The following results pertain to all methods:

(1) The higher the correlation among the independent or correct
variables, the better the performance.

(2) Lower variances in the independent or correct variables
yielded better performance.

Minimum MSE, Minimum Cp, Minimum Sp Method. The following results
were additionally observed for these methods:

(1) The higher the number of extraneous variables, the closer the
response value is to its true theoretical value.

(2) Smaller sample sizes give better results.

Miller's Method. The following additional results were observed
for this method:

(1) Adding extraneous variables causes improvement of the TMSEP
for a model.

(2) Better performance is obtained with larger sample sizes.

Analysis. As with PM, Box and Whisker plots were employed to
further assess the impact of each factor (A, B, C, D, E, F) on
TMSEP for a given method. STATISTIX 4.0 was also used to produce
these Box and Whisker plots by forming the indicators in the same
manner as before. A set of indicator values was created for each
of the six factors studied. The six resulting plots revealed much
about the ability of each subset selection technique to create a
model close to the actual model.
Figure 8. Box and Whisker Plots Showing the Effect of
Factor A on TMSEP by Method

The  number  of  extraneous  variables  involved  had  very 
little  impact  on  how  close  a  method  came  to  selecting  the 
absolutely  correct  model.  Of  the  four  methods  studied, 
however,  MSE  appears  to  perform  best. 
Figure 9. Box and Whisker Plots Showing the Effect of
Factor B on TMSEP by Method


Clearly the amount of correlation between the correct variables
has a great effect on model accuracy. When the correct variables
are highly correlated, one contains almost all of the information
contained in all four of them (including the one omitted from the
pool).
Figure 10. Box and Whisker Plots Showing the Effect of
Factor C on TMSEP by Method


Factor  C,  the  variance  of  the  extraneous  variables,  has 
no  effect  on  the  accuracy  of  any  method.  If  the  focus  was 
on  selecting  the  correct  variables,  it  follows  that  a  change 
in  the  extraneous  variables  would  have  little  effect. 
Figure 11. Box and Whisker Plots Showing the Effect of
Factor D on TMSEP by Method


The variance of the independent or correct variables, factor D,
affects the performance of all four methods. Minimum MSE and
Minimum Sp methods appear to be more affected than Minimum Cp and
Miller's methods.


Figure 12. Box and Whisker Plots Showing the Effect of
Factor E on TMSEP by Method

Increasing the sample size, factor E, tends to increase the
variance in the Minimum MSE, Minimum Sp, and Minimum Cp methods.
Miller's method, however, becomes slightly more consistent.
Figure 13. Box and Whisker Plots Showing the Effect of
Factor F on TMSEP by Method

The variance of the error term, factor F, has no apparent effect
on the accuracy of the four methods studied. All the methods were
able to filter out the white noise equally well.
V. Conclusions and Recommendations for Further Research

Conclusion 

Objective. The objectives of this research were to: (1) identify
some promising least squares selection procedures discussed in
the literature, (2) introduce, implement, and study a variable
selection method proposed by Alan J. Miller, and (3) make an
extension of Hansen's research by comparing the methods he
examined (Minimum MSE, Minimum Sp, and Minimum Cp) with Miller's
method.

Techniques Studied. The Minimum MSE, Minimum Sp, and Minimum Cp
variable selection techniques have received much praise in the
past 20 years. Due to its similarity to the Maximum R² criterion
and its adjustment for degrees of freedom, Minimum MSE was
considered the favored technique fifteen years ago. More recently
Minimum Sp and Minimum Cp, both of which are based on MSEP, have
received the majority of the praise. Of the two, Minimum Sp is
the more [illegible] selection method because it is designed for
random regressors (Hansen, 1988:59).

Compared to the three well-known techniques above, Miller's
method was obscure and untested. A literature search revealed
only Miller's original reference to the procedure. This research
has compared and contrasted Miller's method with the
well-accepted techniques, Minimum MSE, Minimum Sp, and Minimum
Cp, and thereby defined its role among current variable screening
techniques.

Methodology. To facilitate a comparative analysis of Miller's
method and the other methods, Response Surface Methodology was
employed with two performance measures. The first, designated PM,
measured the percentage of correct variables in a model. The
second, Theoretical Mean Squared Error of Prediction (TMSEP),
measured the predictive error between the model selected and the
theoretical model. A 2^6 full factorial design was set up,
yielding the 64 high/low combinations, or design points, of the
six factors being studied. Using Hansen's data, which had been
generated with 60 replications at each design point, both PM and
TMSEP were calculated for each subset selection method at each
design point. The SAS Stepwise procedure was used to select
significant factors or factor combinations at the α = 0.01 level
and to generate a linear equation for each combination of
performance measure and selection method. Four of these eight
equations revealed what each of the six factors and their
combinations contributed toward improving the percentage of
correct variables (maximizing PM) in a model, and the other four
examined how the same factors related to minimizing the error
between the modeled response and the theoretical response
(minimizing TMSEP). STATISTIX 4.0 was then used to produce Box
and Whisker plots by performance measure and method. These plots
revealed factor effects and provided a graphical analysis of
variance on performance measures by method and factor settings.
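
The layout of the design can be illustrated with a short Python
sketch that enumerates the 64 high/low combinations of the six
factors. The -1/+1 coding and the ordering of the points are
assumptions made for the illustration: factor A is made to
alternate fastest, starting at its low level, so that
odd-numbered design points share one level of A, in keeping with
the data file glossary in Appendix F. The ordering of the
remaining factors is arbitrary here and is not taken from the
thesis.

```python
from itertools import product

FACTORS = ("A", "B", "C", "D", "E", "F")
REPLICATIONS = 60  # data sets generated at each design point


def full_factorial(factors=FACTORS):
    """Return every low/high (-1/+1) combination of the factors.

    itertools.product varies its last position fastest, so the levels
    are paired with the factors in reverse order to make the first
    factor (A) alternate fastest.
    """
    design = []
    for levels in product((-1, 1), repeat=len(factors)):
        design.append(dict(zip(reversed(factors), levels)))
    return design


design = full_factorial()
print(len(design))                 # number of design points
print(len(design) * REPLICATIONS)  # total number of data sets
low_a = [point for point in design if point["A"] == -1]
print(len(low_a) * REPLICATIONS)   # data sets at one level of factor A
```

With this coding, each main effect and interaction in the fitted
equations corresponds to a column of -1/+1 values (or a product
of such columns) in the design matrix.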

Two-Stage  Variable  Selection  Technique.  The  data  used 
in  this  thesis  attempted  to  simulate  real  world  data. 
Extraneous  variables  were  added  and  one  of  the  significant 
predictors  was  totally  dropped  from  consideration  after 
generating  the  data.  In  light  of  these  tough,  inherent  data 
problems,  it  was  suspected  from  the  beginning  of  this  re¬ 
search  effort  that  a  single  selection  method  may  not  be 
effective  at  both  screening  out  the  extraneous  variables  and 
selecting  the  final  model.  Therefore,  two  performance 
measures,  PM  and  TMSEP,  were  examined  because  they  rate 
selection  methods  from  different  vantage  points.  A  selec¬ 
tion  technique  which  rated  highly  under  PM  would  perform 
well  as  a  screening  method  prior  to  final  variable  selec¬ 
tion.  During  the  screening  process  the  objective  is  to 
select  the  greatest  number  of  significant  variables  (or 
correct  or  true  variables)  while  rejecting  any  extraneous 
ones. PM measured how well each method accomplished this. On the
other hand, a selection technique which rated highly under TMSEP
would perform the final variable selection process well. During
the final selection process, a set of likely predictors is
examined and the final subset selected. One hopes that this final
subset of predictors has a response close to that of the
theoretically correct set of predictors. TMSEP measured the
performance of each method in this regard. Note that PM and TMSEP
were calculable only because the data for this research was
generated by a known model. In practice, PM and TMSEP cannot be
calculated. It was the intention of this research, therefore, to
observe the performance of the four variable selection techniques
in question under controlled conditions and to note the
conditions under which they perform best.

In a screening situation where PM would apply, all four PM
equations and a comparison of their regression factor
coefficients indicated that the number of extraneous variables
(factor A) was the most significant factor, sometimes by as much
as two orders of magnitude. Box plots of PM for factor A also
revealed that Miller's method had the highest median PM value.
The equation for PM_Millers reveals why this occurred. PM_Millers
had the highest intercept value, and the number of extraneous
variables reduced the performance measure by less than half the
amount it did in the PM equations for the other methods.
Obviously, when selecting the independent or correct variables
from a variable pool containing extraneous variables, Miller's
method was the method least affected by the presence of
extraneous variables. Thus Miller's method is the best technique
for screening.

Once screened, the variable pool is ready for final model
selection. As stated previously, a method's performance during
this final selection stage is best gauged by TMSEP. TMSEP was
primarily affected by two factors: the variance of the
independent or correct variables (factor D) and the correlation
among the same variables (factor B), as all the TMSEP equations
reveal. The regression coefficients of factors B, D, and the BD
interaction were an order of magnitude larger than any other
coefficients. Closer examination of the TMSEP equations showed
that when factor D (variance of the correct variables) was at its
low setting, factor B (correlation of the correct variables)
caused about the same improvement (decrease) of TMSEP at its high
and low levels. When factor D is set high, however, the low
setting of factor B worsens (increases) TMSEP and the high
setting of factor B improves (decreases) TMSEP. The following
analysis graphically depicts this BD interaction using B+D+BD to
calculate the weights in each quadrant. This explains the
importance of the BD interaction.
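
The quadrant weights can be reproduced with a brief Python
sketch. It uses the B, D, and BD coefficients from the four TMSEP
equations and assumes the factors are coded -1 at the low setting
and +1 at the high setting. The function name `quadrant_weights`
is introduced only for this illustration.

```python
# (B, D, BD) coefficients taken from the TMSEP equations for each method.
COEFFICIENTS = {
    "Minimum MSE": (-16.91, 21.6, -16.57),
    "Minimum Sp": (-17.76, 22.65, -17.4),
    "Minimum Cp": (-25.72, 36.69, -25.19),
    "Miller's Method": (-26.22, 32.26, -25.69),
}


def quadrant_weights(b_coef, d_coef, bd_coef):
    """Contribution of B + D + BD to TMSEP in each (B, D) quadrant.

    Keys are (b, d) pairs with -1 for the low setting and +1 for the
    high setting (an assumed coding).
    """
    return {
        (b, d): b_coef * b + d_coef * d + bd_coef * b * d
        for b in (-1, 1)
        for d in (-1, 1)
    }


for method, coefs in COEFFICIENTS.items():
    print(method)
    for (b, d), weight in sorted(quadrant_weights(*coefs).items()):
        print(f"  B={b:+d}  D={d:+d}  weight={weight:+.2f}")
```

For every method the two weights at D low are nearly equal, while
at D high the weight is large and positive for B low and negative
for B high, which is the pattern described in the paragraph
above.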
Figure 14. Graphical Analysis of the BD Interaction by
Method


Box  plots  for  factors  B  and  D  show  that  the  Minimum  MSE 
method  had  the  best  median  TMSEP,  followed  closely  by  the 
Minimum  Sp  method.  Furthermore,  the  following  box  plot  for 
the  BD  interaction  factor  confirms  that  the  Minimum  MSE 
method  would  perform  best  as  a  final  selection  technique. 


Figure 15. Box and Whisker Plots Showing the Effect of
Factor BD on TMSEP by Method


This  research  proposes  a  two-stage  variable  selection 
technique.  Miller's  method  is  used  to  first  screen  the 
variable  pool  and  reduce  the  number  of  extraneous  variables. 
Next  the  Minimum  MSE  method  is  used  to  select  the  model  from 
this  reduced  variable  pool. 

Factor C, the variance of the extraneous variables, had little or
no effect on either PM or TMSEP. It was the only factor which had
no impact throughout this research effort. Neither did it appear
in any of the PM or TMSEP equations. Based on these results, this
factor could be dropped from further consideration.

Another useful result of this research is the comparison of the
two MSEP criteria: Minimum Sp and Minimum Cp. A great deal of
praise has been given to the Minimum Sp criterion in the past 15
years. It was identified as one of the most promising methods
when the regressors are random and one desires to minimize the
mean square error of prediction. The Minimum Cp criterion has
also received praise for minimizing mean square error of
prediction, but its usefulness is limited to cases where the
regressors are fixed. Some have recommended that the Minimum Cp
criterion not be used in practice.

The results of this thesis indicate that the Minimum Sp method
outperformed the Minimum Cp method at every factor level, using
both PM and TMSEP. No evidence was found to refute the assertion
that the Minimum Cp criterion should not be used in practice. In
fact, this research effort supports using the Minimum Sp method
instead of the Minimum Cp method, thereby improving the selection
process.

Most other simulations have dealt with the number of correct
variables chosen of those available. No provisions were made for
circumstances in which a significant regressor is not included in
the variable pool. Therefore, techniques praised as good variable
selection techniques may not be as appealing as originally
thought. This appears to be the case with Minimum Cp. It should
be noted, however, that Mallows' Cp method (Cp-close-to-p) is not
the same as the Minimum Cp method (Cp-close-to-zero). This
Mallows' Cp method, as originally proposed, was not studied in
this thesis.

Recommendations  for  Further  Research 

This research effort lends itself to several follow-on studies.
The methodology established by Hansen and the computer
programming groundwork in this research project make
embellishments and the use of more complex models a feasible
task.

One  area  which  leads  to  further  research  deals  with 
expanding  the  number  of  factors  under  consideration.  This 
research  effort  studied  six  factors,  but  many  more  could  be 
added.  The  response  surface  region  could  be  expanded  to 
include  negative  correlation,  larger  sample  sizes,  and  the 
spread  of  the  variance  on  the  independent  or  correct  vari¬ 
ables.  The  factors  studied  could  also  include  an  indicator 
variable  to  keep  track  of  the  effect  of  dropping  a  signifi¬ 
cant  variable.  That  is,  by  including  a  variable  to  keep 
track  of  the  difference  between  the  full  model  and  a  model 
where  a  variable  is  dropped,  one  could  quantify  the  effects 
of  failing  to  collect  data  on  all  the  significant  variables. 
This research only collected information on the effects of
dropping a variable, and it was assumed that if all the variables
were present the techniques studied would perform better.
However, to gain a better understanding of Miller's method, it
would be worthwhile to quantify the effects of not including all
significant variables in the variable pool. To implement this,
factor C (the variance of the extraneous variables), which had no
effect, could be replaced with the indicator variable described
above. Thus, the information desired could be gained without
increasing the size of the experimental design.

Further research could also be done to address the question of
which screening and final selection method combinations work best
together and under what circumstances. The four methods studied
in this thesis could generate 16 screening and final selection
method combinations. Some of these combinations may be eliminated
a priori, but the rest could be studied either under the original
six factors used in this thesis or under an expanded set of
factors. The number of methods considered could also be
increased. One method which could be added is Mallows' Cp, as the
method was originally set forth. This would allow a comparison
between Miller's method and other variable selection techniques
not studied in this thesis.

This thesis effort has implemented a promising new variable
selection technique: Miller's method. Additionally, by comparing
its performance with three well-tested methods, this research has
served to suggest a possible role for Miller's method among the
many selection techniques. The results of this research indicate
that Miller's method may be most effective when used as a
screening method prior to final variable selection.


Appendix F: A Glossary of Input/Output Data Files


Used Throughout All Sections

01.dat, 02.dat, ..., 64.dat

64 specifically generated data files, one file for each of the
64 permutations of the six-factor analysis

01.dat, 03.dat, ..., 63.dat

Of the 64 files, these are the ones with one extraneous variable

02.dat, 04.dat, ..., 64.dat

Of the 64 files, these are the ones with three extraneous
variables

Temp.dat

Scratch file used to pass large amounts of data between FORTRAN
main routines and their associated subroutines. Always contains
temporary data generated by the most recently executed FORTRAN
program.


Calculating PM for MSE, SP and CP methods

Error1_all.lis

Listing generated by SAS program Error1_all.sas. Contains output
from the procedure RSquare (options MSE, SP, CP) run on 1920 data
sets with one extraneous variable.

Error3_all.lis

Listing generated by SAS program Error1_all.sas. Contains output
from the procedure RSquare (options MSE, SP, CP) run on 1920 data
sets with three extraneous variables.

Error1_all.dat

Output from the FORTRAN subroutine Count1.for. Contains the
selected model according to the MSE, SP, and CP methods for each
of the 1920 data sets with one extraneous variable.

Error3_all.dat

Output from the FORTRAN subroutine Count1.for. Contains the
selected model according to the MSE, SP, and CP methods for each
of the 1920 data sets with three extraneous variables.

PM1.dat

Output from the FORTRAN subroutine Count1.for. Contains
performance measures at each of the 32 odd design points
(1,3,...,63) for the MSE, SP, and CP methods of variable
selection.

PM3.dat

Output from the FORTRAN subroutine Count3.for. Contains
performance measures at each of the 32 even design points
(2,4,...,64) for the MSE, SP, and CP methods of variable
selection.


Calculating PM for Miller's method

Step11_all.lis

Listing generated by SAS program Step11_all.sas. Contains output
from the procedure Stepwise (Forward Selection) run on 960 data
sets (1,3,...,31) with one extraneous variable.

Step13_all.lis

Listing generated by SAS program Step11_all.sas. Contains output
from the procedure Stepwise (Forward Selection) run on 960 data
sets (33,35,...,63) with one extraneous variable.

Step31_all.lis

Listing generated by SAS program Step31_all.sas. Contains output
from the procedure Stepwise (Forward Selection) run on 480 data
sets (2,4,...,16) with three extraneous variables.

Step32_all.lis

Listing generated by SAS program Step32_all.sas. Contains output
from the procedure Stepwise (Forward Selection) run on 480 data
sets (18,20,...,32) with three extraneous variables.

Step33_all.lis

Listing generated by SAS program Step33_all.sas. Contains output
from the procedure Stepwise (Forward Selection) run on 480 data
sets (34,36,...,48) with three extraneous variables.

Step34_all.lis

Listing generated by SAS program Step34_all.sas. Contains output
from the procedure Stepwise (Forward Selection) run on 480 data
sets (50,52,...,64) with three extraneous variables.

Step1_Input.dat

Input data file for FilStepCount.for. Contains the names of the
SAS listing files (from data sets with one extraneous variable)
that FilCount.for is to process.

Step3_Input.dat

Input data file for FilStepCount.for. Contains the names of the
SAS listing files (from data sets with three extraneous
variables) that FilCount.for is to process.

Step1_all.dat

Generated by FORTRAN subroutine StepCount1.for. Contains the
model selected via Miller's method for each of 1920 data sets
with one extraneous variable.

Step3_all.dat

Generated by FORTRAN subroutine StepCount3.for. Contains the
model selected via Miller's method for each of 1920 data sets
with three extraneous variables.

PMStep1.dat

Output from the FORTRAN subroutine StepCount1.for. Contains
performance measures at each of the 32 odd design points
(1,3,...,63) for Miller's method of variable selection.

PMStep3.dat

Output from the FORTRAN subroutine StepCount3.for. Contains
performance measures at each of the 32 even design points
(2,4,...,64) for Miller's method of variable selection.


Stepwise Analysis using PM for each method

PM.dat

Output from the statistical analysis program STATISTIX 4.0.
Contains the design point, four columns of PMs (one for each
method) augmented with a full 2^6 factorial design matrix. This
file is then used as input to the SAS program PM.sas.

PM.lis

Listing file generated by the SAS program PM.sas. Contains the
complete analysis from the procedure Stepwise. Attempts a best
fit for each method's PM as a linear function of the six factors
studied and their interactions.


Calculating TMSEP for MSE, SP and CP methods

TMSEP1_all.lis

Listing generated by SAS program TMSEP1_all.sas. Contains output
from the procedure RSquare (options MSE, SP, CP, and B) run on
1920 data sets with one extraneous variable.

TMSEP3_all.lis

Listing generated by SAS program TMSEP3_all.sas. Contains output
from the procedure RSquare (options MSE, SP, CP, and B) run on
1920 data sets with three extraneous variables.

TMSEP1.dat

Generated by FORTRAN subroutine TMSEP1.for. Contains the TMSEPs
for the 32 odd design points (1,3,...,63) with one extraneous
variable.

TMSEP3.dat

Generated by FORTRAN subroutine TMSEP3.for. Contains the TMSEPs
for the 32 even design points (2,4,...,64) with three extraneous
variables.


Calculating TMSEP for Miller's method
Miller1Beta.sas

Generated by FORTRAN program MillSAS.for. This is a SAS
input program designed to calculate the constant and the
coefficients of regression for each of the 1920 models
selected using Miller's method and data sets with one
extraneous variable.

Miller3Beta.sas

Generated by FORTRAN program MillSAS.for. This is a SAS
input program designed to calculate the constant and the
coefficients of regression for each of the 1920 models
selected using Miller's method and data sets with three
extraneous variables.


Miller1Beta.lis

Listing file generated by the SAS program Miller1Beta.sas.
Contains the unformatted and unfiltered data on the constant
and the coefficients of regression for each of the 1920
models selected using Miller's method and data sets with one
extraneous variable.

Miller3Beta.lis

Listing file generated by the SAS program Miller3Beta.sas.
Contains the unformatted and unfiltered data on the constant
and the coefficients of regression for each of the 1920
models selected using Miller's method and data sets with
three extraneous variables.


Miller1Beta.dat

Generated by the FORTRAN subroutine Beta1.for. Contains the
filtered and formatted data on the constant and the
coefficients of regression for each of the 1920 models
selected using Miller's method and data sets with one
extraneous variable.

Miller3Beta.dat

Generated by the FORTRAN subroutine Beta3.for. Contains the
filtered and formatted data on the constant and the
coefficients of regression for each of the 1920 models
selected using Miller's method and data sets with three
extraneous variables.


MillTM1.dat

Generated by the FORTRAN subroutine MillTM1.for. Contains
the TMSEPs for the odd design points (1,3,...,63) with one
extraneous variable.

MillTM3.dat

Generated by the FORTRAN subroutine MillTM3.for. Contains
the TMSEPs for the even design points (2,4,...,64) with three
extraneous variables.


Stepwise  Analysis  using  TMSEP  for  each  method 
TM.dat 

Output from the statistical analysis program STATISTIX 4.0.
Contains the design point and four columns of TMSEPs (one
for each method) augmented with a full 2^6 factorial design
matrix. This file is then used as input to the SAS program
TM.sas.

TM.lis 

Listing file generated by the SAS program TM.sas. Contains
the complete analysis from the procedure Stepwise. Attempts
a best fit for each method's TMSEP as a linear function of
the six factors studied and their interactions.

Appendix G: A Glossary of FORTRAN Program Files


Calculating  PM  for  MSE,  SP  and  CP  methods 


FilCount.for 

PURPOSE: Filters the SAS RSquare listings and generates the
formatted file Temp.dat of all the possible model
combinations for each of the 3840 data sets. Calls Count1
and Count3 to select the best model.

INPUT DATA FILES: Error1_all.lis, Error3_all.lis

OUTPUT DATA FILES: Temp.dat

SUBROUTINES CALLED: Count1.for, Count3.for


Count1.for

PURPOSE: Selects the best model for each of the 1920 data
sets (one extraneous variable) from a file of all possible
model combinations for each set. Uses the MSE, SP, and CP
methods of variable selection. Calculates a performance
measure for each of the three groups of 60 models selected
at each of the odd design points (1,3,...,63).

INPUT DATA FILES: Temp.dat

OUTPUT DATA FILES: Error1_all.dat, PM1.dat

SUBROUTINES CALLED: None
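
The selection step described for Count1.for amounts to taking, for each data set, the candidate model with the smallest value of each criterion. A minimal sketch in Python follows. The record layout (a dict with keys vars, mse, sp, cp) and the numeric values are hypothetical and are not the Temp.dat format; only the minimise-each-criterion rule comes from the description above.

```python
def select_best(models):
    """Return the variable set minimising each criterion.

    models: list of dicts with keys 'vars', 'mse', 'sp', 'cp'
    (hypothetical layout, not the Temp.dat record format).
    """
    return {
        crit: min(models, key=lambda m: m[crit])["vars"]
        for crit in ("mse", "sp", "cp")
    }


# Made-up criterion values for three candidate models of one data set.
candidates = [
    {"vars": ("X1",), "mse": 2.10, "sp": 0.30, "cp": 9.4},
    {"vars": ("X1", "X2", "X3"), "mse": 1.02, "sp": 0.21, "cp": 3.1},
    {"vars": ("X1", "X2", "X3", "E1"), "mse": 1.00, "sp": 0.24, "cp": 5.0},
]
best = select_best(candidates)
```

With these made-up numbers the three criteria disagree: minimum MSE keeps the extraneous variable E1, while minimum SP and minimum CP do not.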


Count3.for

PURPOSE: Selects the best model for each of the 1920 data
sets (three extraneous variables) from a file of all
possible model combinations for each set. Uses the MSE, SP,
and CP methods of variable selection. Calculates a
performance measure for each of the three groups of 60
models selected at each of the even design points
(2,4,...,64).

INPUT DATA FILES: Temp.dat

OUTPUT DATA FILES: Error3_all.dat, PM3.dat

SUBROUTINES CALLED: None


Calculating PM for Miller's method


FilStepCount.for

PURPOSE: Filters the SAS Stepwise (Forward Selection)
listings and generates the formatted file Temp.dat of the
model selected for each of the 3840 data sets. Calls
StepCount1 and StepCount3 to select the best models.

INPUT DATA FILES: Step11_all.lis, Step13_all.lis,
Step31_all.lis, Step32_all.lis,
Step33_all.lis, Step34_all.lis,
Step1_Input.dat, Step3_Input.dat

OUTPUT DATA FILES: Temp.dat

SUBROUTINES CALLED: StepCount1.for, StepCount3.for


StepCount1.for

PURPOSE: Implements Miller's method for each of the 1920
models (from data sets with one extraneous variable).
Calculates a performance measure for each group of 60 models
selected at each odd design point (1,3,...,63).

INPUT DATA FILES: Temp.dat

OUTPUT DATA FILES: Step1_all.dat, PMStep1.dat

SUBROUTINES CALLED: None


StepCount3.for

PURPOSE: Implements Miller's method for each of the 1920
models (from data sets with three extraneous variables).
Calculates a performance measure for each group of 60 models
selected at each even design point (2,4,...,64).

INPUT DATA FILES: Temp.dat

OUTPUT DATA FILES: Step3_all.dat, PMStep3.dat

SUBROUTINES CALLED: None


Calculating TMSEP for MSE, SP and CP methods


TMSEP.for 

PURPOSE: Filters the SAS RSquare listings and generates the
formatted file Temp.dat of all the possible model
combinations for each of the 3840 data sets. Calls TMSEP1
and TMSEP3 to select the best model.

INPUT DATA FILES: TMSEP1_all.lis, TMSEP3_all.lis

OUTPUT DATA FILES: Temp.dat

SUBROUTINES CALLED: TMSEP1.for, TMSEP3.for


TMSEP1.for

PURPOSE: Selects the best model for each of the 1920 data
sets (one extraneous variable) from a file of all possible
model combinations for each set. Uses the MSE, SP, and CP
methods of variable selection. Using each of the data sets
with one extraneous variable, it calculates a TMSEP for each
of the three groups of 60 models selected at each of the odd
design points (1,3,...,63).

INPUT DATA FILES: 01.dat, 03.dat, ..., 63.dat, Temp.dat

OUTPUT DATA FILES: TMSEP1.dat

SUBROUTINES  CALLED:  None 


TMSEP3.for 

PURPOSE: Selects the best model for each of the 1920 data
sets (three extraneous variables) from a file of all
possible model combinations for each set. Uses the MSE, SP,
and CP methods of variable selection. Using each of the data
sets with three extraneous variables, it calculates a TMSEP
for each of the three groups of 60 models selected at each
of the even design points (2,4,...,64).

INPUT DATA FILES: 02.dat, 04.dat, ..., 64.dat, Temp.dat

OUTPUT  DATA  FILES:  TMSEP3.dat 

SUBROUTINES  CALLED:  None 




Calculating  TMSEP  for  Miller's  method 


MillSAS.for

PURPOSE: Reads the 3840 models (selected by Miller's method)
and generates SAS code, specific to each model, to estimate
the constant term and the coefficients of regression for
that model.

INPUT DATA FILES: Step1_all.dat, Step3_all.dat
OUTPUT DATA FILES: Miller1Beta.sas, Miller3Beta.sas
SUBROUTINES CALLED: None


MillBeta.for 

PURPOSE: Calls Beta1 and Beta3 and then calls MillTM1 and
MillTM3.

INPUT DATA FILES: None
OUTPUT DATA FILES: None
SUBROUTINES CALLED: Beta1.for, Beta3.for,
MillTM1.for, MillTM3.for


Beta1.for

PURPOSE: Filters the unformatted SAS listing file produced
by the SAS program Miller1Beta.sas (from data sets with one
extraneous variable) and outputs the estimates of the model
constant and regression coefficients in a sorted, formatted
order.

INPUT DATA FILES: Miller1Beta.lis
OUTPUT DATA FILES: Miller1Beta.dat
SUBROUTINES CALLED: None
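
The "sorted, formatted order" can be pictured as placing each estimated coefficient into a fixed column for its variable and leaving zero where a variable is not in the model. A minimal sketch in Python follows. The column order X1, X2, X3, E1 follows the Beta1.for listing in Appendix I; the function name and the example values are hypothetical.

```python
SLOTS = ("X1", "X2", "X3", "E1")  # column order used for one extraneous variable


def sort_betas(names, betas, slots=SLOTS):
    """Place each coefficient in the column of its variable; absent variables stay 0.0."""
    out = [0.0] * len(slots)
    for name, beta in zip(names, betas):
        if name in slots:  # names outside the expected set are ignored
            out[slots.index(name)] = beta
    return out


# A model that contains only E1 and X2, listed in that order.
row = sort_betas(["E1", "X2"], [0.5, -1.25])
```

Writing every model in the same column layout lets a later program read the coefficients without knowing which variables each model selected.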


Beta3.for

PURPOSE: Filters the unformatted SAS listing file produced
by the SAS program Miller3Beta.sas (from data sets with
three extraneous variables) and outputs the estimates of the
model constant and regression coefficients in a sorted,
formatted order.

INPUT DATA FILES: Miller3Beta.lis
OUTPUT DATA FILES: Miller3Beta.dat
SUBROUTINES CALLED: None


MillTM1.for

PURPOSE: At each of the 32 odd design points (1,3,...,63,
the ones with only one extraneous variable) it examines each
of the 60 models predicted and calculates an aggregated
TMSEP for that design point.

INPUT DATA FILES: Miller1Beta.dat, 01.dat, ..., 63.dat
OUTPUT DATA FILES: MillTM1.dat
SUBROUTINES CALLED: None


MillTM3.for 

PURPOSE: At each of the 32 even design points (2,4,...,64,
the ones with three extraneous variables) it examines each
of the 60 models predicted and calculates an aggregated
TMSEP for that design point.

INPUT DATA FILES: Miller3Beta.dat, 02.dat, 04.dat, ..., 64.dat
OUTPUT DATA FILES: MillTM3.dat
SUBROUTINES CALLED: None


BARR.FOR

PURPOSE: Written by Dr. David Barr to read and correct most
of the errors found in Hansen's data files. After correction
it scans each data file and outputs certain data
characteristics for verification. Some errors had to be
corrected by hand, but this program allows the experimenter
to be certain about the data's current characteristics.
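
The data characteristics that BARR.FOR reports are variances accumulated from running sums, using the identity variance = (sum of squares)/n - (sum/n)^2 that appears throughout its listing in Appendix I. A minimal sketch of that computation in Python follows; the function name is hypothetical.

```python
def running_variance(values):
    """Population variance from running sums, as sum(x*x)/n - (sum(x)/n)**2."""
    n = 0
    total = 0.0
    total_sq = 0.0
    for v in values:
        n += 1
        total += v
        total_sq += v * v
    return total_sq / n - (total / n) ** 2


variance = running_variance([1.0, 2.0, 3.0, 4.0])
```

This form needs only one pass over a data file, which suits a scanning program, although it divides by n rather than n - 1 and can lose precision when the mean is large relative to the spread.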


Appendix H: A Glossary of SAS Program Files


Calculating PM for MSE, SP and CP methods
Error1_all.sas

Reads 01.dat, 03.dat, ..., 63.dat by set and uses the RSquare
procedure to generate all possible models for each set of 10
or 20. MSE, SP, and CP statistics are calculated for each
model. The listing file Error1_all.lis is output.

Error3_all.sas

Reads 02.dat, 04.dat, ..., 64.dat by set and uses the RSquare
procedure to generate all possible models for each set of 10
or 20. MSE, SP, and CP statistics are calculated for each
model. The listing file Error3_all.lis is output.
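
"All possible models" means every non-empty subset of the candidate predictors. A minimal sketch in Python follows, assuming the four candidates X1, X2, X3, E1 named in the Beta1.for listing; this gives 15 models, which matches the 15-element arrays in Count1.for. Reading the three-extraneous case as six candidates (63 models) is an assumption by analogy.

```python
from itertools import combinations


def all_subsets(predictors):
    """Every non-empty subset of the predictors, smallest subsets first."""
    return [
        subset
        for size in range(1, len(predictors) + 1)
        for subset in combinations(predictors, size)
    ]


one_extraneous = all_subsets(["X1", "X2", "X3", "E1"])
three_extraneous = all_subsets(["X1", "X2", "X3", "E1", "E2", "E3"])
```

The count grows as 2^p - 1 for p candidates, so an exhaustive search is cheap at this scale.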


Calculating  PM  for  Miller's  method 
Step11_all.sas

Reads 01.dat, 03.dat, ..., 31.dat by set and augments each set
with four known random predictors. The Stepwise procedure
is then run and one model is chosen for each set. The
listing Step11_all.lis is generated.

Step13_all.sas

Reads 33.dat, 35.dat, ..., 63.dat by set and augments each set
with four known random predictors. The Stepwise procedure
is then run and one model is chosen for each set. The
listing Step13_all.lis is generated.

Step31_all.sas

Reads 02.dat, 04.dat, ..., 16.dat by set and augments each set
with six known random predictors. The Stepwise procedure is
then run and one model is chosen for each set. The listing
Step31_all.lis is generated.

Step32_all.sas

Reads 18.dat, 20.dat, ..., 32.dat by set and augments each set
with six known random predictors. The Stepwise procedure is
then run and one model is chosen for each set. The listing
Step32_all.lis is generated.

Step33_all.sas

Reads 34.dat, 36.dat, ..., 48.dat by set and augments each set
with six known random predictors. The Stepwise procedure is
then run and one model is chosen for each set. The listing
Step33_all.lis is generated.


Step34_all.sas

Reads 50.dat, 52.dat, ..., 64.dat by set and augments each set
with six known random predictors. The Stepwise procedure is
then run and one model is chosen for each set. The listing
Step34_all.lis is generated.
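
This glossary does not spell out how the known random predictors are used once the Stepwise procedure has run. One plausible reading, offered here only as an assumption, is that the forward-selection entry order is cut off at the first random predictor to enter, since anything entering after pure noise is hard to defend. The following Python sketch illustrates that reading; the function name and the predictor names R1 to R4 are hypothetical.

```python
def truncate_at_first_random(entry_order, random_names):
    """Keep the variables that entered before any known random predictor."""
    kept = []
    for name in entry_order:
        if name in random_names:
            break  # a noise predictor entered: stop selecting here
        kept.append(name)
    return kept


# Hypothetical entry order from a forward selection run.
chosen = truncate_at_first_random(
    ["X1", "X3", "R2", "X2"], {"R1", "R2", "R3", "R4"}
)
```

Under this assumed rule the model keeps X1 and X3 and drops X2, because X2 entered only after the random predictor R2.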


Stepwise  analysis  using  PM  for  each  method 
PM.sas

Reads PM.dat and performs four separate Stepwise
regressions. Each regression considers a different dependent
variable but uses the same independent variables.
Generates listing file PM.lis.


Calculating  TMSEP  for  MSE,  SP  and  CP  methods 
TMSEP1_all.sas

Reads 01.dat, 03.dat, ..., 63.dat by set and uses the RSquare
procedure to generate all possible models for each set of 10
or 20. MSE, SP, and CP statistics and the coefficients of
regression are calculated for each model. The listing file
TMSEP1_all.lis is output.

TMSEP3_all.sas

Reads 02.dat, 04.dat, ..., 64.dat by set and uses the RSquare
procedure to generate all possible models for each set of 10
or 20. MSE, SP, and CP statistics and the coefficients of
regression are calculated for each model. The listing file
TMSEP3_all.lis is output.


Calculating  TMSEP  for  Miller's  method 
Miller1Beta.sas

Reads 01.dat, 03.dat, ..., 63.dat by set and uses the RSquare
procedure (with various switches) to calculate the
coefficients of regression for only the model selected for
each data set by Miller's method. The listing file
Miller1Beta.lis is generated.

Miller3Beta.sas

Reads 02.dat, 04.dat, ..., 64.dat by set and uses the RSquare
procedure (with various switches) to calculate the
coefficients of regression for only the model selected for
each data set by Miller's method. The listing file
Miller3Beta.lis is generated.


Stepwise analysis using TMSEP for each method
TM.sas

Reads TM.dat and performs four separate Stepwise
regressions. Each regression considers a different dependent
variable but uses the same independent variables.
Generates listing file TM.lis.

Appendix I: FORTRAN Programs


List  of  FORTRAN  Programs 


BARR.FOR
BETA1.FOR
BETA3.FOR
COUNT1.FOR
COUNT3.FOR
FILCOUNT.FOR
FILSTEPCOUNT.FOR
MILLBETA.FOR
MILLSAS.FOR
MILLTM1.FOR
MILLTM3.FOR
STEPCOUNT1.FOR
STEPCOUNT3.FOR
TMSEP.FOR
TMSEP1.FOR
TMSEP3.FOR

************************************************************
*  BARR.FOR
*
*  This program reads in Hansen's data files and scans them
*  for certain data characteristics.  These are output for
*  verification.
************************************************************
      real x(4),ex(3)
      integer n,set,count,ind(6),inum,nex,ss,nexref(2)
      integer lecount,hecount,lxcount,hxcount
      double precision
     +  errorsum,error2sum,lerrorsum,lerror2sum
      double precision exsum(3),ex2sum(3),xsum(4),x2sum(4)
      double precision herrorsum,herror2sum,lexsum,lex2sum
      double precision
     +  hexsum,hex2sum,lxsum,lx2sum,hxsum,hx2sum
      double precision xprod(4,4),error
      character*6 nameoffile(64)

      nameoffile(1)='01.dat'
      nameoffile(2)='02.dat'
      nameoffile(3)='03.dat'
      nameoffile(4)='04.dat'
      nameoffile(5)='05.dat'
      nameoffile(6)='06.dat'
      nameoffile(7)='07.dat'
      nameoffile(8)='08.dat'
      nameoffile(9)='09.dat'
      nameoffile(10)='10.dat'
      nameoffile(11)='11.dat'
      nameoffile(12)='12.dat'
      nameoffile(13)='13.dat'
      nameoffile(14)='14.dat'
      nameoffile(15)='15.dat'
      nameoffile(16)='16.dat'
      nameoffile(17)='17.dat'
      nameoffile(18)='18.dat'
      nameoffile(19)='19.dat'
      nameoffile(20)='20.dat'
      nameoffile(21)='21.dat'
      nameoffile(22)='22.dat'
      nameoffile(23)='23.dat'
      nameoffile(24)='24.dat'
      nameoffile(25)='25.dat'
      nameoffile(26)='26.dat'
      nameoffile(27)='27.dat'
      nameoffile(28)='28.dat'
      nameoffile(29)='29.dat'
      nameoffile(30)='30.dat'
      nameoffile(31)='31.dat'
      nameoffile(32)='32.dat'
      nameoffile(33)='33.dat'
      nameoffile(34)='34.dat'
      nameoffile(35)='35.dat'
      nameoffile(36)='36.dat'
      nameoffile(37)='37.dat'
      nameoffile(38)='38.dat'
      nameoffile(39)='39.dat'
      nameoffile(40)='40.dat'
      nameoffile(41)='41.dat'
      nameoffile(42)='42.dat'
      nameoffile(43)='43.dat'
      nameoffile(44)='44.dat'
      nameoffile(45)='45.dat'
      nameoffile(46)='46.dat'
      nameoffile(47)='47.dat'
      nameoffile(48)='48.dat'
      nameoffile(49)='49.dat'
      nameoffile(50)='50.dat'
      nameoffile(51)='51.dat'
      nameoffile(52)='52.dat'
      nameoffile(53)='53.dat'
      nameoffile(54)='54.dat'
      nameoffile(55)='55.dat'
      nameoffile(56)='56.dat'
      nameoffile(57)='57.dat'
      nameoffile(58)='58.dat'
      nameoffile(59)='59.dat'
      nameoffile(60)='60.dat'
      nameoffile(61)='61.dat'
      nameoffile(62)='62.dat'
      nameoffile(63)='63.dat'
      nameoffile(64)='64.dat'
      nexref(1)=1
      nexref(2)=3


 1000 format(5x,4(f15.5,1x))
      open(unit=8,file='iv1.out',status='new')
      open(unit=9,file='iv0.out',status='new')
      open(unit=11,file='dbarr.out',status='new')
      open(unit=12,file='dbarr.log',status='new')
      k=64
      ind(3)=0
      ind(4)=0
      ind(5)=0
      ind(6)=0
      count=0
      lxcount=0
      hxcount=0
      lecount=0
      hecount=0
      lxsum=0
      lx2sum=0
      hxsum=0
      hx2sum=0
      lerrorsum=0
      lerror2sum=0
      herrorsum=0
      herror2sum=0
************************************************************

      do 10 inum=1,k
        errorsum=0
        error2sum=0
        do 11 il1=1,3
          exsum(il1)=0
          ex2sum(il1)=0
 11     continue
        do 12 il2=1,4
          xsum(il2)=0
          x2sum(il2)=0
 12     continue
        do 13 il3=1,4
          do 14 il4=1,4
            xprod(il3,il4)=0
 14       continue
 13     continue
        print *, nameoffile(inum)
        write(11,*) nameoffile(inum)
        write(12,*) nameoffile(inum)
        ind(1)=inum+1-2*((inum+1)/2)
        ind(2)=iabs(2*((inum+3)/4)-((inum+1)/2)-1)
        ind(3)=iabs(2*((inum+7)/8)-((inum+3)/4)-1)
        ind(4)=iabs(2*((inum+15)/16)-((inum+7)/8)-1)
        ind(5)=iabs(2*((inum+31)/32)-((inum+15)/16)-1)
        ind(6)=iabs(2*((inum+63)/64)-((inum+31)/32)-1)
        nex=nexref(ind(1)+1)
        open(unit=10,file=nameoffile(inum),status='old')
        if (ind(5).eq.0) then
          ss=10
        else
          ss=20
        endif
        n=ss*60
        do 50 h=1,n
          read (10,*) set,y,(x(i),i=1,4),(ex(i),i=1,nex)
          count=count+1
          if(ind(6).eq.0) then
            lecount=lecount+1
          else
            hecount=hecount+1
          endif
          if(nex.eq.1) then
            ex(2)=0
            ex(3)=0
          endif
          call errorcomp(x,y,error,errorsum,error2sum)
          call extra(nex,ex,exsum,ex2sum)
          call xi(x,xsum,x2sum,xprod)
          write(11,*) set,y,error,(x(i),i=1,4),(ex(i),i=1,3),
     +      (ind(7-i),i=1,6)
 50     continue
        call
     +  endprint(n,ind,nex,ss,errorsum,error2sum,lerrorsum,
     +  lerror2sum,exsum,ex2sum,xsum,x2sum,xprod,herrorsum,
     +  herror2sum,
     +  lxsum,lx2sum,hxsum,hx2sum,lxcount,
     +  hxcount,nameoffile,inum)
 10   continue


      print *, ' '
      write(12,*) ' '
      print *, 'number of observations = ',count
      write(12,*) 'number of observations = ',count
      print *, 'small independent variance = ',
     +  lx2sum/(lxcount)-(lxsum/(lxcount))**2
      print *, 'large independent variance = ',
     +  hx2sum/(hxcount)-(hxsum/(hxcount))**2
      write(12,*) 'small independent variance = ',
     +  lx2sum/(lxcount)-(lxsum/(lxcount))**2
      write(12,*) 'large independent variance = ',
     +  hx2sum/(hxcount)-(hxsum/(hxcount))**2
      print *, 'small error variance = ',
     +  lerror2sum/lecount-(lerrorsum/lecount)**2
      write(12,*) 'small error variance = ',
     +  lerror2sum/lecount-(lerrorsum/lecount)**2
      print *, 'large error variance = ',
     +  herror2sum/hecount-(herrorsum/hecount)**2
      write(12,*) 'large error variance = ',
     +  herror2sum/hecount-(herrorsum/hecount)**2
      END

************************************************************
      subroutine errorcomp(x,y,error,errorsum,error2sum)
      double precision error,errorsum,error2sum
      real x(4)
      integer p
      yactual=0
      do 60 p=1,4
        yactual = yactual+x(p)
 60   continue
      error=y-yactual
      errorsum=errorsum+error
      error2sum=error2sum+error*error
      return
      END

************************************************************
      subroutine endprint(n,ind,nex,ss,errorsum,error2sum,
     +  lerrorsum,lerror2sum,exsum,ex2sum,xsum,x2sum,xprod,
     +  herrorsum,herror2sum,
     +  lxsum,lx2sum,hxsum,hx2sum,lxcount,
     +  hxcount,nameoffile,inum)
      integer n,ind(6),nex,ss,lxcount,hxcount,inum
      double precision errorsum,error2sum,lerrorsum,
     +  lerror2sum,exsum(3),ex2sum(3),xsum(4),x2sum(4),
     +  xprod(4,4),r(4,4),v(4),herrorsum,herror2sum,ev(3),
     +  lxsum,lx2sum,hxsum,hx2sum,rsum,ex1,ex2
      character*6 nameoffile(64)
 1000 format(5x,4(f15.5,1x))
 1010 format(5x,4(i10,1x))
      print *, (ind(7-i),i=1,6)
      write(12,*) (ind(7-i),i=1,6)
      print *, ' ',ind(1),' there are ',nex,
     +  ' extraneous variables'
      write(12,*) ' ',ind(1),' there are ',nex,
     +  ' extraneous variables'
      do 50 k=1,4
        v(k)=x2sum(k)/n-(xsum(k)/n)**2
 50   continue
      rsum=0
      do 30 i=1,4
        do 40 j=1,i
          rn=xprod(i,j)/n-(xsum(i)/n)*(xsum(j)/n)
          r(i,j)=rn/sqrt(v(i)*v(j))
          r(j,i)=r(i,j)
          if(i.ne.j) then
            rsum=rsum+r(i,j)
          endif
 40     continue
 30   continue
      print *, ' ',ind(2),' correlation ',rsum/6
      write(12,*) ' ',ind(2),' correlation ',rsum/6
      do 35 i=1,4
        print *, (r(i,j),j=1,4)
        write(12,1000) (r(i,j),j=1,4)
 35   continue

      ex1=0
      ex2=0
      do 10 j=1,nex
        ev(j)=ex2sum(j)/n-(exsum(j)/n)**2
        ex1=ex1+exsum(j)
        ex2=ex2+ex2sum(j)
 10   continue
      ve=(ex2/(n*nex))-(ex1/(n*nex))**2
      print *, ' ',ind(3),' variance of extraneous ',ve
      write(12,*) ' ',ind(3),' variance of extraneous ',ve
      print *, (ev(k),k=1,nex)
      write(12,1000) (ev(k),k=1,nex)
      write(12,1000) (exsum(i),i=1,nex)
      write(12,1000) (ex2sum(i),i=1,nex)
      write(12,1010) n,n*nex
      x1=xsum(1)+xsum(2)+xsum(3)+xsum(4)
      x2=x2sum(1)+x2sum(2)+x2sum(3)+x2sum(4)
      vx=x2/(4*n)-(x1/(4*n))**2
      print *, ' ',ind(4),' variance of independent ',vx
      write(12,*) ' ',ind(4),' variance of independent ',vx
      if (vx.lt..00125) then
        lxsum=lxsum+x1
        lx2sum=lx2sum+x2
        lxcount=lxcount+4*n
        write(9,*) nameoffile(inum),ind(4),
     +    x2/(4*n)-(x1/(4*n))**2
      else
        hxsum=hxsum+x1
        hx2sum=hx2sum+x2
        hxcount=hxcount+4*n
        write(8,*) nameoffile(inum),ind(4),
     +    x2/(4*n)-(x1/(4*n))**2
      endif

      print *, ' ',ind(5),' the sample size is ',ss
      write(12,*) ' ',ind(5),' the sample size is ',ss
      print *, ' ',ind(6),' error variance =',
     +  error2sum/n-(errorsum/n)**2
      write(12,*) ' ',ind(6),' error variance =',
     +  error2sum/n-(errorsum/n)**2
      if(ind(6).eq.0) then
        lerrorsum=lerrorsum+errorsum
        lerror2sum=lerror2sum+error2sum
      else
        herrorsum=herrorsum+errorsum
        herror2sum=herror2sum+error2sum
      endif
      close (unit=10)
      return
      END

************************************************************
      subroutine extra(nex,ex,exsum,ex2sum)
      integer nex
      real ex(3)
      double precision exsum(3),ex2sum(3)
      do 10 i=1,nex
        exsum(i)=exsum(i)+ex(i)
        ex2sum(i)=ex2sum(i)+ex(i)*ex(i)
 10   continue
      return
      END

      subroutine xi(x,xsum,x2sum,xprod)
      real x(4)
      double precision xsum(4),x2sum(4),xprod(4,4)
      do 10 i=1,4
        xsum(i)=xsum(i)+x(i)
        do 20 j=1,4
          xprod(i,j)=xprod(i,j)+x(i)*x(j)
 20     continue
 10   continue
      do 30 k=1,4
        x2sum(k)=xprod(k,k)
 30   continue
      return
      END

****************** FORTRAN PROGRAM BETA1.FOR ****************

      SUBROUTINE BETA1

      INTEGER VARNUM,MODELNUM,J,K,L,N,P,TOTALLINES,CHARPOS
      REAL R2, B0, BETA(4), SORTED_BETAS(4)
      CHARACTER*132 LINE
      CHARACTER*2 MODEL(4)

      TOTALLINES=0

      OPEN (unit=10, file='miller1beta.lis',status='OLD',
     +  iostat=IERROR,err=1500)
      OPEN (unit=11, file='miller1beta.dat',status='NEW',
     +  iostat=IERROR,err=1500)

 5    CONTINUE
      READ (10,900,END=888) LINE
 900  FORMAT (A132)
      DO 10 J=1,132
        IF (LINE(J:J).EQ.'I') THEN
          CHARPOS = J
          GO TO 20
        ENDIF
 10   CONTINUE
      GO TO 5

 20   CONTINUE
      DO 35 L=1,4
        SORTED_BETAS(L)=0.0
        BETA(L)=0.0
 35   CONTINUE


      IF ((LINE((CHARPOS+1):(CHARPOS+1))).EQ.'N') THEN
        READ (10,*)
        READ (10,*,END=1300) VARNUM, R2, B0
        VARNUM = VARNUM-1
        IF (VARNUM.GT.0) GO TO 1200
        WRITE (11,902) VARNUM, B0
 902    FORMAT
     +(1X,I1,5X,F9.5,7X,'0.00000',7X,'0.00000',7X,'0.00000',
     +  7X,'0.00000')
        TOTALLINES=TOTALLINES+1
      ELSE
        IF ((LINE((CHARPOS+1):(CHARPOS+1))).EQ.'n') THEN
          K=1
          DO 30 J=CHARPOS,132
            IF(((LINE(J:J)).EQ.'X').OR.((LINE(J:J)).EQ.'E'))
     +      THEN
              MODEL(K) = LINE(J:(J+1))
              K = K+1
            ENDIF
 30       CONTINUE
          MODELNUM = K-1
          IF (MODELNUM.lt.1) GO TO 1100
          READ (10,*)
          VARNUM=0
          READ (10,*,END=1300)VARNUM, R2,
     +      B0,(BETA(N),N=1,VARNUM)
          IF (VARNUM.NE.MODELNUM) GO TO 1000

          DO 40 P=1,VARNUM
            IF (MODEL(P).EQ.'X1') SORTED_BETAS(1)=BETA(P)
            IF (MODEL(P).EQ.'X2') SORTED_BETAS(2)=BETA(P)
            IF (MODEL(P).EQ.'X3') SORTED_BETAS(3)=BETA(P)
            IF (MODEL(P).EQ.'E1') SORTED_BETAS(4)=BETA(P)
 40       CONTINUE
          WRITE (11,901) VARNUM, B0,(SORTED_BETAS(N),N=1,4)
 901      FORMAT(1X,I1,5X,F9.5,5X,F9.5,5X,F9.5,5X,F9.5,5X,F9.5)
          TOTALLINES = TOTALLINES+1
        ENDIF
      ENDIF
      GO TO 5

 888  CONTINUE
      CLOSE (10)
      CLOSE (11)
      PRINT *, 'FILTERING OF MILLER1BETA.LIS IS COMPLETE.'
      PRINT *, TOTALLINES,' LINES WRITTEN TO MILLER1BETA.DAT.'
      PRINT *,' '
      GO TO 1600

 1000 CONTINUE
      PRINT *,'Unexpected file format!',
     +  ' # of variable names does not',
     +  'correspond to # of variables read.'
      GO TO 1600

 1100 CONTINUE
      PRINT *,'Unexpected file format!',
     +  ' Could not find X1, X2, X3, or E1.'
      GO TO 1600

 1200 CONTINUE
      PRINT *,'Unexpected file format! Expecting ONLY B0.'
      GO TO 1600

 1300 CONTINUE
      PRINT *,'Unexpected file format! Encountered EOF while ',
     +  'attempting to read VARNUM, R2, B0, and/or Betas.'
      GO TO 1600

 1500 CONTINUE
      PRINT 1501,'+++ ERROR WHILE OPENING FILE +++',
     +  ' error code = ',IERROR
 1501 FORMAT (/1X, A/ 1X, A, I8/)

 1600 CONTINUE
      END

************************************************************
***************** FORTRAN PROGRAM BETA3.FOR *****************
************************************************************

      SUBROUTINE BETA3

      INTEGER VARNUM,MODELNUM,J,K,L,N,P,TOTALLINES,CHARPOS
      REAL R2, B0, BETA(6), SORTED_BETAS(6)
      CHARACTER*132 LINE
      CHARACTER*2 MODEL(6)

      TOTALLINES=0

      OPEN (unit=12, file='miller3beta.lis',status='OLD',
     +  iostat=IERROR,err=1500)
      OPEN (unit=13, file='miller3beta.dat',status='NEW',
     +  iostat=IERROR,err=1500)

 5    CONTINUE
      READ (12,900,END=888) LINE
 900  FORMAT (A132)
      DO 10 J=1,132
        IF (LINE(J:J).EQ.'I') THEN
          CHARPOS = J
          GO TO 20
        ENDIF
 10   CONTINUE
      GO TO 5

 20   CONTINUE
      DO 35 L=1,6
        SORTED_BETAS(L)=0.0
        BETA(L)=0.0
 35   CONTINUE


      IF ((LINE((CHARPOS+1):(CHARPOS+1))).EQ.'N') THEN
        READ (12,*)
        READ (12,*,END=1300) VARNUM, R2, B0
        VARNUM = VARNUM-1
        IF (VARNUM.GT.0) GO TO 1200
        WRITE (13,902) VARNUM, B0
 902    FORMAT (1X,I1,5X,F9.5,7X,'0.00000',7X,'0.00000',
     +    7X,'0.00000',7X,'0.00000',7X,'0.00000',
     +    7X,'0.00000')
        TOTALLINES=TOTALLINES+1
      ELSE
        IF ((LINE((CHARPOS+1):(CHARPOS+1))).EQ.'n') THEN
          K=1
          DO 30 J=CHARPOS,132
            IF(((LINE(J:J)).EQ.'X').OR.((LINE(J:J)).EQ.'E'))
     +      THEN
              MODEL(K) = LINE(J:(J+1))
              K = K+1
            ENDIF
 30       CONTINUE
          MODELNUM = K-1
          IF (MODELNUM.lt.1) GO TO 1100
          READ (12,*)
          VARNUM=0
          READ (12,*,END=1300)VARNUM, R2,
     +      B0,(BETA(N),N=1,VARNUM)
          IF (VARNUM.NE.MODELNUM) GO TO 1000

          DO 40 P=1,VARNUM
            IF (MODEL(P).EQ.'X1') SORTED_BETAS(1)=BETA(P)
            IF (MODEL(P).EQ.'X2') SORTED_BETAS(2)=BETA(P)
            IF (MODEL(P).EQ.'X3') SORTED_BETAS(3)=BETA(P)
            IF (MODEL(P).EQ.'E1') SORTED_BETAS(4)=BETA(P)
            IF (MODEL(P).EQ.'E2') SORTED_BETAS(5)=BETA(P)
            IF (MODEL(P).EQ.'E3') SORTED_BETAS(6)=BETA(P)
 40       CONTINUE
          WRITE (13,901) VARNUM, B0,
     +      (SORTED_BETAS(N),N=1,6)
 901      FORMAT (1X,I1,5X,F9.5,5X,F9.5,5X,F9.5,5X,F9.5,5X,
     +      F9.5,5X,F9.5,5X,F9.5)
          TOTALLINES = TOTALLINES+1
        ENDIF
      ENDIF
      GO TO 5


 888  CONTINUE
      CLOSE (12)
      CLOSE (13)
      PRINT *, 'FILTERING OF MILLER3BETA.LIS IS COMPLETE.'
      PRINT *, TOTALLINES,' LINES WRITTEN TO MILLER3BETA.DAT.'
      PRINT *,' '
      GO TO 1600

 1000 CONTINUE
      PRINT *,'Unexpected file format!',
     +  ' # of variable names does not',
     +  'correspond to # of variables read.'
      GO TO 1600

 1100 CONTINUE
      PRINT *,'Unexpected file format!',
     +  ' Could not find X1, X2, X3, E1, E2, E3.'
      GO TO 1600

 1200 CONTINUE
      PRINT *,'Unexpected file format! Expecting ONLY B0.'
      GO TO 1600

 1300 CONTINUE
      PRINT *,'Unexpected file format! Encountered EOF while ',
     +  'attempting to read VARNUM, R2, B0, and/or Betas.'
      GO TO 1600

 1500 CONTINUE
      PRINT 1501,'+++ ERROR WHILE OPENING FILE +++',
     +  ' error code = ',IERROR
 1501 FORMAT (/1X, A/ 1X, A, I8/)

 1600 CONTINUE
      END
*********************** fortran program COUNT1.FOR **********

SUBROUTINE Count1 (NewOut)

integer num(15),i,j,k,ptrmse,ptrsp
integer ptrcp,varsmse,varssp,varscp
integer check(4)
integer n,emse,esp,ecp,DesignPoint
integer ccp,cmse,csp,cumemse,cumecp,cumesp
integer chartmse(0:3,0:3),chartcp(0:3,0:3)
integer chartsp(0:3,0:3)
real MSE(15),Sp(15),cp(15),r2(15)
real minmse,minsp,mincp,nummse,numcp,numsp
real mseeer,cpeer,speer
real msepm,cppm,sppm
character*2 m(4,15)
character*20 NewOut

check(1)=1
check(2)=5
check(3)=11
check(4)=15
DesignPoint=1

do 7 i = 0,3
do 3 k = 0,3
chartmse(i,k)=0
chartcp(i,k)=0
chartsp(i,k)=0
3  continue
7  continue

varsmse=0
varssp =0
varscp =0
cumemse=0
cumesp=0
cumecp=0

open (unit=11, file='temp.dat', status='old',
+  iostat=IERROR, err=1000)
open (unit=12, file=NewOut, status='new',
+  iostat=IERROR, err=1000)
open (unit=13, file='PM1.dat', status='new',
+  iostat=IERROR, err=1000)

write(13,*) ' DESIGNPOINT MSE ',
+  'SP CP'

Do 50 jj=1,63,2
Write(12,*) ' '
Write(12,*) ' '
Write(12,*) ' '
Write(12,*) '********************* DESIGN POINT ',
+  DesignPoint, ' ********************'
Write(12,*) ' '
DesignPoint=DesignPoint+2

do 20 k=1,60
do 10 i=1,15
emse=0
esp=0
ecp=0
read(11,'(1X,I1)',end=40) num(i)

IF (num(i).EQ.1) THEN
read(11,905,end=40) num(i),r2(i),cp(i),MSE(i),Sp(i)
+  ,m(1,i)
905  format(7X,I1,4X,F10.8,3X,F9.5,3X,F9.7,2X,
+  F10.8,2X,A2)
ELSE
IF (num(i).EQ.2) THEN
read(11,910,end=40) num(i),r2(i),cp(i),MSE(i),Sp(i)
+  ,(m(j,i),j=1,2)
910  format(7X,I1,4X,F10.8,3X,F9.5,3X,F9.7,2X,
+  F10.8,1X,2(1X,A2))
ELSE
IF (num(i).EQ.3) THEN
read(11,915,end=40) num(i),r2(i),cp(i),MSE(i),Sp(i)
+  ,(m(j,i),j=1,3)
915  format(7X,I1,4X,F10.8,3X,F9.5,3X,F9.7,2X,
+  F10.8,1X,3(1X,A2))
ELSE
IF (num(i).EQ.4) THEN
read(11,920,end=40) num(i),r2(i),cp(i),MSE(i),Sp(i)
+  ,(m(j,i),j=1,4)
920  format(7X,I1,4X,F10.8,3X,F9.5,3X,F9.7,2X,
+  F10.8,1X,4(1X,A2))
ELSE
Print *, 'Number of variables not found;',
+  ' input file in wrong format!'
ENDIF
ENDIF
ENDIF
ENDIF

minmse=10000
minsp =10000
mincp =10000
ptrmse=0
ptrcp =0
ptrsp =0
do 30 j= 1,4
if (mse(check(j)).lt.minmse) then
minmse=mse(check(j))
ptrmse=check(j)
endif
if (sp(check(j)).lt.minsp) then
minsp=sp(check(j))
ptrsp=check(j)
endif
if (cp(check(j)).lt.mincp) then
mincp=cp(check(j))
ptrcp=check(j)
endif
30  continue
10  continue
40  continue

varsmse=varsmse+num(ptrmse)
varssp =varssp +num(ptrsp)
varscp =varscp +num(ptrcp)

do 70 n=1,num(ptrmse)
if (m(n,ptrmse).EQ.'E1') then
emse=emse+1
endif
70  continue

do 80 n=1,num(ptrsp)
if (m(n,ptrsp).eq.'E1') then
esp=esp+1
endif
80  continue

do 90 n=1,num(ptrcp)
if (m(n,ptrcp).eq.'E1') then
ecp=ecp+1
endif
90  continue

cumemse=cumemse+emse
cumesp=cumesp+esp
cumecp=cumecp+ecp
cmse=num(ptrmse)-emse
ccp=num(ptrcp)-ecp
csp=num(ptrsp)-esp

chartmse(cmse,emse)=chartmse(cmse,emse)+1
chartcp(ccp,ecp)=chartcp(ccp,ecp)+1
chartsp(csp,esp)=chartsp(csp,esp)+1

write(12,*) 'MSE',num(ptrmse),mse(ptrmse)
+  ,Sp(ptrmse),cp(ptrmse),' ',
+  (m(j,ptrmse),j=1,num(ptrmse))
write(12,*) 'Sp ',num(ptrsp),mse(ptrsp)
+  ,Sp(ptrsp),cp(ptrsp),' ',
+  (m(j,ptrsp),j=1,num(ptrsp))
write(12,*) 'Cp ',num(ptrcp),mse(ptrcp)
+  ,Sp(ptrcp),cp(ptrcp),' ',
+  (m(j,ptrcp),j=1,num(ptrcp))
write(12,*) '*******************************'
write(12,*) ' '
write(12,*) ' '
20  continue

nummse = real(varsmse)/60.0
numsp = real(varssp)/60.0
numcp = real(varscp)/60.0
mseeer = real(cumemse)/60.0
cpeer = real(cumecp)/60.0
speer = real(cumesp)/60.0
msepm = 1-(mseeer/nummse)
cppm = 1-(cpeer/numcp)
sppm = 1-(speer/numsp)

write(12,*) 'The avg number of vars using MSE',
+  ' was ', nummse
write(12,*) 'The avg number of extraneous vars from MSE',
+  ' was ', mseeer
write(12,*) '****** The PM for MSE was ', msepm, ' ******'
write(12,*) ' '
write(12,*) 'The avg number of vars using Sp was ',
+  numsp
write(12,*) 'The avg number of extraneous vars from Sp',
+  ' was ', speer
write(12,*) '****** The PM for Sp was ', sppm, ' ******'
write(12,*) ' '
write(12,*) 'The avg number of vars using Cp was ',
+  numcp
write(12,*) 'The avg number of extraneous vars from Cp',
+  ' was ', cpeer
write(12,*) '****** The PM for Cp was ', cppm, ' ******'
write(12,*) ' '
write(12,*) ' '
write(12,*) 'Correct Vars (0-3, down) -VS- ',
+  'Extraneous Vars (0-3, across)'
write(12,*) ' '
write(12,*) ' MSE TABLE'
write(12,*) ' '
do 100 i=0,3
write(12,*) (chartmse(i,j),j=0,3)
100  continue
write(12,*) ' '
write(12,*) ' '
write(12,*) 'Sp TABLE'
write(12,*)
do 110 i=0,3
write(12,*) (chartsp(i,j),j=0,3)
110  continue
write(12,*) ' '
write(12,*) ' '
write(12,*) 'Cp TABLE'
do 120 i=0,3
write(12,*) (chartcp(i,j),j=0,3)
120  continue

write(13,*) (DesignPoint-2),' ',msepm,' ',
+  sppm,' ',cppm
50  Continue
Close(11)
Close(12)
Close(13)
GO TO 1200

* Error trap: ******************************************
1000  Continue
Print 1100, '+++ ERROR WHILE OPENING FILE +++',
+  ' error code = ', IERROR
1100  FORMAT(/1X, A/ 1X, A, I8/)

1200  CONTINUE
Print *, 'Counting complete. ', NewOut, ' written.'
END
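
The performance measure computed at the end of Count1 can be illustrated in isolation. The sketch below is free-form Fortran, not part of the thesis code; the program name and the two totals are hypothetical values chosen only to show the arithmetic PM = 1 - (average extraneous variables / average variables selected), with a guard for the empty-model case that the listing above does not include.

```fortran
program pm_sketch
  implicit none
  integer, parameter :: nreps = 60      ! replications per design point
  integer :: total_vars, total_extraneous
  real :: avg_vars, avg_extraneous, pm

  ! Hypothetical totals over the 60 replications (illustrative only)
  total_vars = 180
  total_extraneous = 30

  avg_vars = real(total_vars) / real(nreps)
  avg_extraneous = real(total_extraneous) / real(nreps)

  ! PM is the proportion of selected variables that are correct
  if (avg_vars > 0.0) then
     pm = 1.0 - avg_extraneous / avg_vars
  else
     pm = 0.0                           ! assumption: no variables selected
  end if

  print '(A,F8.4)', 'PM = ', pm
end program pm_sketch
```

With these assumed totals the ratio is 30/180, so the PM comes out near 0.83.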
*************** fortran program COUNT3.FOR *************
********************************************************

SUBROUTINE Count3 (NewOut)

integer num(63),i,j,k,ptrmse,ptrsp
integer ptrcp,varsmse,varssp,varscp
integer check(6)
integer n,emse,esp,ecp,DesignPoint
integer ccp,cmse,csp
integer chartmse(0:3,0:3),chartcp(0:3,0:3)
integer chartsp(0:3,0:3)
real MSE(63),Sp(63),cp(63),r2(63)
real minmse,minsp,mincp,nummse,numcp,numsp
real mseeer,cpeer,speer
real msepm,cppm,sppm
character*2 m(6,63)
character*20 NewOut

check(1)=1
check(2)=7
check(3)=22
check(4)=42
check(5)=57
check(6)=63
DesignPoint=2

do 7 i = 0,3
do 3 k = 0,3
chartmse(i,k)=0
chartcp(i,k)=0
chartsp(i,k)=0
3  continue
7  continue

varsmse=0
varssp =0
varscp =0
cumemse=0
cumesp=0
cumecp=0

open (unit=11, file='temp.dat', status='old',
+  iostat=IERROR, err=1000)
open (unit=12, file=NewOut, status='new',
+  iostat=IERROR, err=1000)
open (unit=13, file='PM3.dat', status='new',
+  iostat=IERROR, err=1000)

write(13,*) ' DESIGNPOINT MSE ',
+  'SP CP'

Do 50 jj=2,64,2
Write(12,*) ' '
Write(12,*) ' '
Write(12,*) '****************** DESIGN POINT ',
+  DesignPoint, ' ********************'
Write(12,*) ' '
DesignPoint=DesignPoint+2

do 20 k=1,60
do 10 i=1,63
emse=0
esp=0
ecp=0
read(11,'(1X,I1)',end=40) num(i)

IF (num(i).EQ.1) THEN
read(11,905,end=40) num(i),r2(i),cp(i),MSE(i),Sp(i)
+  ,m(1,i)
905  format(7X,I1,4X,F10.8,3X,F9.5,3X,F9.7,2X,
+  F10.8,2X,A2)
ELSE
IF (num(i).EQ.2) THEN
read(11,910,end=40) num(i),r2(i),cp(i),MSE(i),Sp(i)
+  ,(m(j,i),j=1,2)
910  format(7X,I1,4X,F10.8,3X,F9.5,3X,F9.7,2X,
+  F10.8,1X,2(1X,A2))
ELSE
IF (num(i).EQ.3) THEN
read(11,915,end=40) num(i),r2(i),cp(i),MSE(i),Sp(i)
+  ,(m(j,i),j=1,3)
915  format(7X,I1,4X,F10.8,3X,F9.5,3X,F9.7,2X,
+  F10.8,1X,3(1X,A2))
ELSE
IF (num(i).EQ.4) THEN
read(11,920,end=40) num(i),r2(i),cp(i),MSE(i),Sp(i)
+  ,(m(j,i),j=1,4)
920  format(7X,I1,4X,F10.8,3X,F9.5,3X,F9.7,2X,
+  F10.8,1X,4(1X,A2))
ELSE
IF (num(i).EQ.5) THEN
read(11,925,end=40) num(i),r2(i),cp(i),MSE(i),Sp(i)
+  ,(m(j,i),j=1,5)
925  format(7X,I1,4X,F10.8,3X,F9.5,3X,F9.7,2X,
+  F10.8,1X,5(1X,A2))
ELSE
IF (num(i).EQ.6) THEN
read(11,930,end=40) num(i),r2(i),cp(i),MSE(i),Sp(i)
+  ,(m(j,i),j=1,6)
930  format(7X,I1,4X,F10.8,3X,F9.5,3X,F9.7,2X,
+  F10.8,1X,6(1X,A2))
ELSE
Print *, 'Number of variables not found;',
+  ' input file in wrong format!'
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF

minmse=10000
minsp =10000
mincp =10000
ptrmse=0
ptrcp =0
ptrsp =0
do 30 j= 1,6
if (mse(check(j)).lt.minmse) then
minmse=mse(check(j))
ptrmse=check(j)
endif
if (sp(check(j)).lt.minsp) then
minsp=sp(check(j))
ptrsp=check(j)
endif
if (cp(check(j)).lt.mincp) then
mincp=cp(check(j))
ptrcp=check(j)
endif
30  continue
10  continue
40  continue

varsmse=varsmse+num(ptrmse)
varssp =varssp +num(ptrsp)
varscp =varscp +num(ptrcp)

do 70 n=1,num(ptrmse)
if (m(n,ptrmse).EQ.'E1') then
emse=emse+1
elseif (m(n,ptrmse).eq.'E2') then
emse=emse+1
elseif (m(n,ptrmse).eq.'E3') then
emse=emse+1
else
continue
endif
70  continue

do 80 n=1,num(ptrsp)
if (m(n,ptrsp).eq.'E1') then
esp=esp+1
elseif (m(n,ptrsp).eq.'E2') then
esp=esp+1
elseif (m(n,ptrsp).eq.'E3') then
esp=esp+1
else
continue
endif
80  continue

do 90 n=1,num(ptrcp)
if (m(n,ptrcp).eq.'E1') then
ecp=ecp+1
elseif (m(n,ptrcp).eq.'E2') then
ecp=ecp+1
elseif (m(n,ptrcp).eq.'E3') then
ecp=ecp+1
else
continue
endif
90  continue

cumemse=cumemse+emse
cumesp=cumesp+esp
cumecp=cumecp+ecp
cmse=num(ptrmse)-emse
ccp=num(ptrcp)-ecp
csp=num(ptrsp)-esp

chartmse(cmse,emse)=chartmse(cmse,emse)+1
chartcp(ccp,ecp)=chartcp(ccp,ecp)+1
chartsp(csp,esp)=chartsp(csp,esp)+1

write(12,*) 'MSE',num(ptrmse),mse(ptrmse)
+  ,Sp(ptrmse),cp(ptrmse),' ',
+  (m(j,ptrmse),j=1,num(ptrmse))
write(12,*) 'Sp ',num(ptrsp),mse(ptrsp)
+  ,Sp(ptrsp),cp(ptrsp),' ',
+  (m(j,ptrsp),j=1,num(ptrsp))
write(12,*) 'Cp ',num(ptrcp),mse(ptrcp)
+  ,Sp(ptrcp),cp(ptrcp),' ',
+  (m(j,ptrcp),j=1,num(ptrcp))
write(12,*) '*************************************',
+  '**********************'
write(12,*) ' '
write(12,*) ' '
20  continue

nummse = real(varsmse)/60.0
numsp = real(varssp)/60.0
numcp = real(varscp)/60.0
mseeer = real(cumemse)/60.0
cpeer = real(cumecp)/60.0
speer = real(cumesp)/60.0
msepm = 1-(mseeer/nummse)
cppm = 1-(cpeer/numcp)
sppm = 1-(speer/numsp)

write(12,*) 'The avg number of vars using MSE',
+  ' was ', nummse
write(12,*) 'The avg number of extraneous vars from MSE',
+  ' was ', mseeer
write(12,*) '*** The PM for MSE was ', msepm, ' ****'
write(12,*) ' '
write(12,*) 'The avg number of vars using Sp was',
+  numsp
write(12,*) 'The avg number of extraneous vars from Sp',
+  ' was ', speer
write(12,*) '****** The PM for Sp was ', sppm, ' ***'
write(12,*) ' '
write(12,*) 'The avg number of vars using Cp was',
+  numcp
write(12,*) 'The avg number of extraneous vars from Cp',
+  ' was ', cpeer
write(12,*) '****** The PM for Cp was ', cppm, ' ***'
write(12,*) ' '
write(12,*) ' '
write(12,*) 'Correct Vars (0-3, down) -VS- ',
+  'Extraneous Vars (0-3, across)'
write(12,*) ' '
write(12,*) ' MSE TABLE'
write(12,*) ' '
do 100 i=0,3
write(12,*) (chartmse(i,j),j=0,3)
100  continue
write(12,*) ' '
write(12,*) ' '
write(12,*) 'Sp TABLE'
write(12,*)
do 110 i=0,3
write(12,*) (chartsp(i,j),j=0,3)
110  continue
write(12,*) ' '
write(12,*) ' '
write(12,*) 'Cp TABLE'
do 120 i=0,3
write(12,*) (chartcp(i,j),j=0,3)
120  continue

write(13,*) (DesignPoint-2),' ',msepm,
+  ' ',sppm,' ',cppm
50  Continue
Close(11)
Close(12)
Close(13)
GO TO 1200

* Error trap: ******************************************
1000  Continue
Print 1100, '+++ ERROR WHILE OPENING FILE +++',
+  ' error code = ', IERROR
1100  FORMAT(/1X, A/ 1X, A, I8/)

********************************************************
1200  CONTINUE
Print *, 'Counting complete. ', NewOut, ' written.'
END
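
Count3 tallies extraneous variables by comparing each variable name against 'E1', 'E2', and 'E3' in turn. As a hedged illustration in free-form Fortran, the same tally can be made by testing only the first character of the name. The program name and the sample model below are hypothetical, and the shortcut assumes every extraneous variable name begins with 'E' and no true predictor does, which matches the X1 to X3 and E1 to E3 naming used in these listings.

```fortran
program count_extraneous_sketch
  implicit none
  character(len=2) :: model(4)
  integer :: n, nvars, nextra

  ! Hypothetical selected model (illustrative only)
  model = (/ 'X1', 'X3', 'E2', 'E1' /)
  nvars = 4

  ! A name starting with 'E' is counted as extraneous
  nextra = 0
  do n = 1, nvars
     if (model(n)(1:1) == 'E') nextra = nextra + 1
  end do

  ! Correct variables are whatever remains
  print *, 'extraneous =', nextra, ' correct =', nvars - nextra
end program count_extraneous_sketch
```

For this sample model the loop finds two extraneous and two correct variables, which would increment the (2,2) cell of the corresponding chart array.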


************************************************************

FILCOUNT.FOR

*This program takes SAS R-Squared listings in any file and
*extracts models with the lowest MSE, Cp, and Sp and then
*figures the performance measure (PM). This program calls
*subroutines Count1.for and Count3.for and writes the
*calculated PM's to PM1.dat and PM3.dat, respectively.

************************************************************

Character*20 NewIn
Character*20 NewOut
Character*80 Line
CHARACTER I, J
Integer Var
Logical VarFlag

5  Continue
Print *,'Name of file to examine? (20 char or less;',
+  ' "*" to quit)'
Read (*,'(A20)') NewIn
If (NewIn(1:1).EQ.'*') GO TO 999
Print *,'Output file? (20 char or less)'
Read (*,'(A20)') NewOut

7  Continue
Print *,'Number of extraneous variables? (1 or 3',
+  ' ONLY!!)'
Read (*,'(I1)') Var
If ((Var.NE.1).AND.(Var.NE.3)) GO TO 7

9  Continue
VarFlag = (Var.EQ.3)
Open (unit=10, file=NewIn, status='OLD',
&  iostat=IERROR, err=1000)
Open (unit=11, file='temp.dat', status='NEW',
&  iostat=IERROR, err=1000)

10  Continue
Read(10,200,END=888) Line
I = LINE(8:8)

J  =  LINE  (9:11) 

IF  (VarFlag)  GO  TO  777 

IF ((I.EQ.'1').AND.(J.EQ.' ')) THEN
WRITE (11,201) I
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'2').AND.(J.EQ.' ')) THEN
WRITE (11,201) I
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'3').AND.(J.EQ.' ')) THEN
WRITE (11,201) I
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'4').AND.(J.EQ.' ')) THEN
WRITE (11,201) I
WRITE (11,200) LINE
ENDIF
ENDIF
ENDIF
ENDIF
GO TO 10

777  Continue
IF ((I.EQ.'1').AND.(J.EQ.' ')) THEN
WRITE (11,201) I
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'2').AND.(J.EQ.' ')) THEN
WRITE (11,201) I
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'3').AND.(J.EQ.' ')) THEN
WRITE (11,201) I
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'4').AND.(J.EQ.' ')) THEN
WRITE (11,201) I
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'5').AND.(J.EQ.' ')) THEN
WRITE (11,201) I
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'6').AND.(J.EQ.' ')) THEN
WRITE (11,201) I
WRITE (11,200) LINE
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
GO TO 10


200  Format (A80)
201  Format (1X,A1)

888  Continue
Close(10)
Close(11)
Print *,'Filtering complete on ',NewIn,'. Counting',
+  ' begun.'

IF (VarFlag) THEN
Call Count3(NewOut)
Print *,'Counting complete. PM''s calculated for',
+  ' designpoints with 3 extraneous variables',
+  ' and written to PM3.dat.'
Print *,' '
ELSE
Call Count1(NewOut)
Print *,'Counting complete. PM''s calculated for',
+  ' designpoints with 1 extraneous variable',
+  ' and written to PM1.dat.'
Print *,' '
ENDIF
GO TO 5

999  Continue
Print *,'Processing complete. Program terminated.'
Stop

* Error trap: ******************************************

1000  Continue
Print 1100, '+++ ERROR WHILE OPENING FILE +++',
&  ' error code = ', IERROR
1100  FORMAT(/1X, A/ 1X, A, I8/)
GO TO 5

************************************************************

END
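
The filtering step in FILCOUNT keeps a listing line only when column 8 holds an accepted model size and columns 9 to 11 are blank. The free-form Fortran sketch below is an illustration, not the thesis code: the program name and the four sample lines are hypothetical, and it uses the INDEX intrinsic in place of the nested IF chain, assuming the one-extraneous-variable case where sizes 1 to 4 are accepted.

```fortran
program filter_sketch
  implicit none
  character(len=80) :: lines(4)
  integer :: k, kept

  ! Hypothetical listing lines (illustrative only)
  lines(1) = repeat(' ', 7) // '2' // '   0.91  X1 X2'   ! size 2, blanks follow
  lines(2) = repeat(' ', 7) // '12' // '  0.95  X1'      ! column 9 not blank
  lines(3) = 'N = 20     Regression Models'              ! header text
  lines(4) = repeat(' ', 7) // '7' // '   0.50  X3'      ! size outside 1-4

  kept = 0
  do k = 1, 4
     ! Keep the line if column 8 is one of '1'..'4' and columns 9-11 are blank
     if (index('1234', lines(k)(8:8)) > 0 .and. &
         lines(k)(9:11) == '   ') then
        kept = kept + 1
        print '(A)', trim(lines(k))
     end if
  end do
  print '(A,I0)', 'lines kept: ', kept
end program filter_sketch
```

Only the first sample line satisfies both conditions. For the three-extraneous-variable case the accepted set would be '123456'.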
************************************************************

FILSTEPCOUNT.FOR

*This program takes SAS Forward Selection Stepwise listings
*in any file and extracts models according to Miller's
*Method and then figures the performance measure (PM). The
*program reads 1 extraneous variable data files from
*Step1_Input.dat and 3 extraneous variables data files from
*Step3_Input.dat, forms a temporary file called TEMP.DAT and
*then calls subroutines StepCount1.for and StepCount3.for to
*analyze the data.

************************************************************

Character*14 NewIn
Character*20 NewOut
Character*80 Line
Character*1 I, J
Character*2 K
Integer Var
Logical VarFlag, BatchFlag

5  Continue
Print *,'Interactive(I) or Batch(B) mode? (I or B',
+  ' only):'
Read (*,'(A1)') Mode
IF ((Mode.NE.'I').AND.(Mode.NE.'B')) Go to 5
BatchFlag=(Mode.EQ.'B')
IF (BatchFlag) Go to 6
Print *,'Name of file to examine? (20 char or less;',
+  ' "*" to quit)'
Read (*,'(A20)') NewIn
If (NewIn(1:1).EQ.'*') GO TO 999

6  Continue
Print *,'Output file? (20 char or less)'
Read (*,'(A20)') NewOut

7  Continue
Print *,'Number of extraneous variables? (1 or 3',
+  ' ONLY!!)'
Read (*,'(I1)') Var
If ((Var.NE.1).AND.(Var.NE.3)) GOTO 7

9  Continue
VarFlag = (Var.EQ.3)
IF ((VarFlag).AND.(BatchFlag)) THEN
Open (unit=9, file='Step3_Input.dat', status='OLD',
+  iostat=IERROR, err=1000)
ELSE
IF ((.NOT.VarFlag).AND.(BatchFlag)) THEN
Open (unit=9, file='Step1_Input.dat',
+  status='OLD',
+  iostat=IERROR, err=1000)
ENDIF
ENDIF

11  Continue
IF (BatchFlag) Read(9,'(A14)',END=666) NewIn
Print *,'Filtering begun on ',NewIn,'. Filtered data ',
+  'is being dumped to TEMP.DAT.'
Open (unit=10, file=NewIn, status='OLD',
&  iostat=IERROR, err=1000)
Open (unit=11, file='temp.dat', status='NEW',
&  iostat=IERROR, err=1000)

10  Continue
Read(10,200,END=888) Line
I = LINE(5:5)

J  =  LINE  (6:8) 

K  =  LINE  (4:5) 

IF  (VarFlag)  GO  TO  777 

IF ((I.EQ.'1').AND.(J.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'2').AND.(J.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'3').AND.(J.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'4').AND.(J.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'5').AND.(J.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'6').AND.(J.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'7').AND.(J.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'8').AND.(J.EQ.' ')) THEN
WRITE (11,200) LINE
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
GO TO 10

777  Continue
IF ((I.EQ.'1').AND.(J.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'2').AND.(J.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'3').AND.(J.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'4').AND.(J.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'5').AND.(J.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'6').AND.(J.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'7').AND.(J.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'8').AND.(J.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'9').AND.(J.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((K.EQ.'10').AND.(J.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((K.EQ.'11').AND.(J.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((K.EQ.'12').AND.(J.EQ.' ')) THEN
WRITE (11,200) LINE
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
GO TO 10

200  Format (A80)

666  Continue
BatchFlag=.FALSE.

888  Continue
IF (BatchFlag) THEN
Close(10)
Go to 11
ELSE
Close(9)
Close(10)
Close(11)
ENDIF

Print *,'Filtering complete. Analysis of data in',
+  ' TEMP.DAT has begun.',' Analysis results will',
+  ' be dumped to ',NewOut,'.'

IF (VarFlag) THEN
Call Stepcount3(NewOut)
ELSE
Call Stepcount1(NewOut)
ENDIF
GO TO 5

999  Continue
Print *,'Processing complete. Program terminated.'
Stop

* Error trap: ******************************************

1000  Continue
Print 1100, '+++ ERROR WHILE OPENING FILE +++',
&  ' error code = ', IERROR
1100  FORMAT(/1X, A/ 1X, A, I8/)
GO TO 5

************************************************************

END


*************** fortran program MILLBETA.FOR ***************

************************************************************

Logical ErrFlag
CALL BETA1
CALL BETA3
ErrFlag = .FALSE.

Call MILLTM1(ErrFlag)
If (ErrFlag) Go to 999
Print *,'TMSEP''s calculated for designpoints with',
+  ' 1 extraneous variable and written to',
+  ' MILLTM1.DAT.'
Print *,' '

Call MILLTM3(ErrFlag)
If (ErrFlag) Go to 999
Print *,'TMSEP''s calculated for designpoints with',
+  ' 3 extraneous variables and written to',
+  ' MILLTM3.DAT.'
Print *,' '

999  Continue
Print *,'Processing complete. Program terminated.'
STOP
END
************************************************************

*************** FORTRAN PROGRAM MILLSAS.FOR ****************

************************************************************

INTEGER VarInModel, DP, REP, ActualREP

CHARACTER*3 J
CHARACTER*6 filename
CHARACTER*11 Model1ex
CHARACTER*17 Model3ex
CHARACTER*80 LINE

OPEN (unit=10, file='Miller1Beta.sas', status='NEW',
+  iostat=IERROR, err=1400)
OPEN (unit=11, file='Miller3Beta.sas', status='NEW',
+  iostat=IERROR, err=1400)
OPEN (unit=12, file='Step1_all.dat', status='OLD',
+  iostat=IERROR, err=1400)
OPEN (unit=13, file='Step3_all.dat', status='OLD',
+  iostat=IERROR, err=1400)

Do 60 DP=1,64

VarInModel = 0
ActualREP = 0
filename = ' '
Model1ex = ' '
Model3ex = ' '

IF (DP.EQ.1) THEN
filename = '01.dat'
GO TO 10
ENDIF

IF  (DP.EQ.2)  THEN 
filename  =  '02.dat' 
GO  TO  20 
ENDIF 

IF  (DP.EQ.3)  THEN 
filename  =  '03.dat' 
GO  TO  10 
ENDIF 

IF  (DP.EQ.4)  THEN 
filename  =  '04.dat'
GO  TO  20 
ENDIF 
IF  (DP.EQ.5)  THEN
filename  =  '05.dat'
GO  TO  10
ENDIF

IF  (DP.EQ.6)  THEN
filename  =  '06.dat'
GO  TO  20
ENDIF

IF  (DP.EQ.7)  THEN
filename  =  '07.dat'
GO  TO  10
ENDIF

IF  (DP.EQ.8)  THEN
filename  =  '08.dat' 
GO  TO  20 
ENDIF 

IF  (DP.EQ.9)  THEN 
filename  =  '09.dat' 
GO  TO  10 
ENDIF 

IF  (DP.EQ.10)  THEN
filename  =  '10.dat'
GO  TO  20
ENDIF

IF  (DP.EQ.11)  THEN
filename  =  '11.dat'
GO  TO  10
ENDIF

IF  (DP.EQ.12)  THEN
filename  =  '12.dat'
GO  TO  20
ENDIF

IF  (DP.EQ.13)  THEN
filename  =  '13.dat'
GO  TO  10 
ENDIF 

IF  (DP.EQ.14)  THEN 
filename  =  '14.dat' 
GO  TO  20 
ENDIF 

IF  (DP.EQ.15)  THEN
filename  =  '15.dat'
GO  TO  10
ENDIF

IF  (DP.EQ.16)  THEN
filename  =  '16.dat'
GO  TO  20
ENDIF

IF  (DP.EQ.17)  THEN
filename  =  '17.dat'
GO  TO  10
ENDIF

IF  (DP.EQ.18)  THEN
filename  =  '18.dat'
GO  TO  20
ENDIF

IF  (DP.EQ.19)  THEN
filename  =  '19.dat'
GO  TO  10
ENDIF

IF  (DP.EQ.20)  THEN
filename  =  '20.dat'
GO  TO  20
ENDIF

IF  (DP.EQ.21)  THEN
filename  =  '21.dat'
GO  TO  10
ENDIF

IF  (DP.EQ.22)  THEN
filename  =  '22.dat'
GO  TO  20
ENDIF

IF  (DP.EQ.23)  THEN
filename  =  '23.dat'
GO  TO  10
ENDIF

IF  (DP.EQ.24)  THEN
filename  =  '24.dat'
GO  TO  20
ENDIF

IF  (DP.EQ.25)  THEN
filename  =  '25.dat'
GO  TO  10 
ENDIF 
IF  (DP.EQ.26)  THEN
filename  =  '26.dat'
GO  TO  20
ENDIF

IF  (DP.EQ.27)  THEN
filename  =  '27.dat'
GO  TO  10
ENDIF

IF  (DP.EQ.28)  THEN
filename  =  '28.dat' 
GO  TO  20 
ENDIF 

IF  (DP.EQ.29)  THEN 
filename  =  '29.dat' 
GO  TO  10 
ENDIF 

IF  (DP.EQ.30)  THEN 
filename  =  '30.dat' 
GO  TO  20 
ENDIF 

IF  (DP.EQ.31)  THEN 
filename  =  '31.dat'
GO  TO  10
ENDIF

IF  (DP.EQ.32)  THEN
filename  =  '32.dat'
GO  TO  20
ENDIF

IF  (DP.EQ.33)  THEN
filename  =  '33.dat'
GO  TO  10
ENDIF

IF  (DP.EQ.34)  THEN
filename  =  '34.dat'
GO  TO  20
ENDIF

IF  (DP.EQ.35)  THEN
filename  =  '35.dat'
GO  TO  10
ENDIF

IF  (DP.EQ.36)  THEN
filename  =  '36.dat'
GO  TO  20
ENDIF 

IF  (DP.EQ.37)  THEN 
filename  =  '37.dat' 
GO  TO  10 
ENDIF 

IF  (DP.EQ.38)  THEN 
filename  =  '38.dat' 
GO  TO  20 
ENDIF 

IF  (DP.EQ.39)  THEN 
filename  =  '39.dat'
GO  TO  10
ENDIF

IF  (DP.EQ.40)  THEN
filename  =  '40.dat'
GO  TO  20
ENDIF

IF  (DP.EQ.41)  THEN
filename  =  '41.dat'
GO  TO  10
ENDIF

IF  (DP.EQ.42)  THEN
filename  =  '42.dat'
GO  TO  20
ENDIF

IF  (DP.EQ.43)  THEN
filename  =  '43.dat'
GO  TO  10
ENDIF

IF  (DP.EQ.44)  THEN
filename  =  '44.dat'
GO  TO  20
ENDIF

IF  (DP.EQ.45)  THEN
filename  =  '45.dat'
GO  TO  10
ENDIF

IF  (DP.EQ.46)  THEN
filename  =  '46.dat'
GO  TO  20
ENDIF

IF  (DP.EQ.47)  THEN
filename  =  '47.dat'
GO  TO  10
ENDIF

IF  (DP.EQ.48)  THEN
filename  =  '48.dat'
GO  TO  20 
ENDIF 

IF  (DP.EQ.49)  THEN 
filename  =  '49.dat' 
GO  TO  10 
ENDIF 

IF  (DP.EQ.50)  THEN 
filename  =  '50.dat' 
GO  TO  20 
ENDIF 

IF  (DP.EQ.51)  THEN 
filename  =  '51.dat'
GO  TO  10
ENDIF

IF  (DP.EQ.52)  THEN
filename  =  '52.dat'
GO  TO  20 
ENDIF 

IF  (DP.EQ.53)  THEN 
filename  =  '53.dat' 
GO  TO  10 
ENDIF 

IF  (DP.EQ.54)  THEN
filename  =  '54.dat'
GO  TO  20
ENDIF

IF  (DP.EQ.55)  THEN
filename  =  '55.dat'
GO  TO  10
ENDIF

IF  (DP.EQ.56)  THEN
filename  =  '56.dat'
GO  TO  20
ENDIF

IF  (DP.EQ.57)  THEN
filename  =  '57.dat'
GO  TO  10
ENDIF

IF  (DP.EQ.58)  THEN
filename  =  '58.dat'
GO  TO  20
ENDIF

IF  (DP.EQ.59)  THEN
filename  =  '59.dat'
GO  TO  10
ENDIF

IF  (DP.EQ.60)  THEN
filename  =  '60.dat'
GO  TO  20
ENDIF

IF  (DP.EQ.61)  THEN
filename  =  '61.dat'
GO  TO  10
ENDIF

IF  (DP.EQ.62)  THEN
filename  =  '62.dat'
GO  TO  20
ENDIF

IF  (DP.EQ.63)  THEN
filename  =  '63.dat'
GO  TO  10
ENDIF

IF  (DP.EQ.64)  THEN
filename  =  '64.dat'
GO  TO  20
ENDIF

10  CONTINUE 

924  CONTINUE 

READ (12,925,END=1350) LINE
925  FORMAT(1X,A80)
J = LINE(1:3)
IF (J.NE.'Rep') GO TO 924

DO 30 REP=1,60
READ (12,900,END=1000) ActualREP, VarInModel, Model1ex
900  FORMAT(5X,I2,12X,I1,6X,A11)
IF (REP.NE.ActualREP) GO TO 1100

WRITE (10,901) FILENAME
901  FORMAT (1X,'FILENAME NEW ''',A6,''';')
WRITE (10,902)
902  FORMAT (1X,'DATA NEW;')
WRITE (10,903)
903  FORMAT (1X,'INFILE NEW;')
WRITE (10,904)
904  FORMAT (1X,'INPUT SETNUM Y X1 X2 X3 X4 E1;')
WRITE (10,905) ActualREP
905  FORMAT (1X,'IF SETNUM^=',I2,' THEN DELETE;')

IF (VarInModel.EQ.0) THEN
WRITE (10,906)
906  FORMAT (1X,'INTERCEP =1;')
WRITE (10,907)
907  FORMAT (1X,'PROC RSQUARE DATA=NEW NOINT B;')
WRITE (10,908)
908  FORMAT (1X,'MODEL Y = INTERCEP;')
ELSE
WRITE (10,909)
909  FORMAT (1X,'PROC RSQUARE DATA=NEW B;')
WRITE (10,910) Model1ex, VarInModel
910  FORMAT (1X,'MODEL Y= ',A11,' /INCLUDE=',I1,';')
ENDIF

WRITE (10,*)
30  CONTINUE


GO  TO  50 


20  CONTINUE 

926  CONTINUE 

READ (13,927,END=1375) LINE
927  FORMAT(1X,A80)
J = LINE(1:3)
IF (J.NE.'Rep') GO TO 926

DO 40 REP=1,60
READ (13,911,END=1200) ActualREP, VarInModel, Model3ex
911  FORMAT(5X,I2,12X,I1,6X,A17)
IF (REP.NE.ActualREP) GO TO 1300

WRITE (11,912) FILENAME
912  FORMAT (1X,'FILENAME NEW ''',A6,''';')
WRITE (11,913)
913  FORMAT (1X,'DATA NEW;')
WRITE (11,914)
914  FORMAT (1X,'INFILE NEW;')
WRITE (11,915)
915  FORMAT (1X,'INPUT SETNUM Y X1 X2 X3 X4 E1 E2 E3;')
WRITE (11,916) ActualREP
916  FORMAT (1X,'IF SETNUM^=',I2,' THEN DELETE;')

IF (VarInModel.EQ.0) THEN
WRITE (11,917)
917  FORMAT (1X,'INTERCEP =1;')
WRITE (11,918)
918  FORMAT (1X,'PROC RSQUARE DATA=NEW NOINT B;')
WRITE (11,919)
919  FORMAT (1X,'MODEL Y = INTERCEP;')
ELSE
WRITE (11,920)
920  FORMAT (1X,'PROC RSQUARE DATA=NEW B;')
WRITE (11,921) Model3ex, VarInModel
921  FORMAT (1X,'MODEL Y= ',A17,' /INCLUDE=',I1,';')
ENDIF

WRITE (11,*)
40  CONTINUE

50  CONTINUE
60  CONTINUE

CLOSE (10)
CLOSE (11)
CLOSE (12)
CLOSE (13)

PRINT *, 'Program completed successfully with ',
+  DP-1,' designpoints and ',REP-1,
+  ' replications.'
GO TO 1500


* ERROR TRAP ***********************************************

1000  CONTINUE
PRINT *, 'ERROR WHILE READING STEP1_ALL.DAT.',
+  ' UNEXPECTED END OF FILE.'
GO TO 1500

1100  CONTINUE
PRINT *, 'STEP1_ALL.DAT IN UNEXPECTED FORMAT.',
+  ' REP COUNTER DOES NOT AGREE WITH FILE.'
GO TO 1500

1200  CONTINUE
PRINT *, 'ERROR WHILE READING STEP3_ALL.DAT.',
+  ' UNEXPECTED END OF FILE.'
GO TO 1500

1300  CONTINUE
PRINT *, 'STEP3_ALL.DAT IN UNEXPECTED FORMAT.',
+  ' REP COUNTER DOES NOT AGREE WITH FILE.'
GO TO 1500

1350  CONTINUE
PRINT *, 'STEP1_ALL.DAT IN UNEXPECTED FORMAT.',
+  ' DP COUNTER DOES NOT AGREE WITH FILE.'

1375  CONTINUE
PRINT *, 'STEP3_ALL.DAT IN UNEXPECTED FORMAT.',
+  ' DP COUNTER DOES NOT AGREE WITH FILE.'

1400  CONTINUE
PRINT 1401, '+++ ERROR WHILE OPENING FILE +++',
+  ' error code = ', IERROR
1401  FORMAT (/1X, A/ 1X, A, I8/)

1500  CONTINUE
STOP
END
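
MILLSAS selects the data file name and the branch for each design point with 64 separate IF blocks: odd design points go to label 10 (one extraneous variable) and even ones to label 20 (three extraneous variables). As a hedged alternative sketch in free-form Fortran, not the author's code, the same mapping can be produced with an internal WRITE and the MOD intrinsic. The program name and the printed text are illustrative assumptions.

```fortran
program dp_filename_sketch
  implicit none
  integer :: dp
  character(len=6) :: filename

  do dp = 1, 64
     ! Internal write: the I2.2 edit descriptor zero-pads, so dp = 1 gives '01.dat'
     write (filename, '(I2.2,A4)') dp, '.dat'

     if (mod(dp, 2) == 1) then
        ! Odd design points: the one-extraneous-variable branch (label 10)
        print *, filename, ' -> 1 extraneous variable'
     else
        ! Even design points: the three-extraneous-variable branch (label 20)
        print *, filename, ' -> 3 extraneous variables'
     end if
  end do
end program dp_filename_sketch
```

Computing the name from DP removes the chance of a mistyped file name in any single block, at the cost of departing from the explicit style of the original listing.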
*  FORTRAN PROGRAM MILLTM1.FOR
*
*  This program is designed to take the 64 groups of 60
*  models selected via Miller's method and the corresponding
*  3840 data sets and find the "real" MSEP for each of the 32
*  odd design points of 64 design points.
*********************************************************
Subroutine MILLTM1(ErrFlag)

Integer h,i,j,k,p,r,s
Integer num
Real b0, betas(4)
Real x(4,20),x3ex1(4,20),ex,y
Real ypredmillers
Real ymsepmillers
Real yssepmillers
Real sumyssepmillers
Real sumdifmillers
Real dpymsepmillers
Character*6 Infile
Logical ErrFlag

Open (unit=11, file='MILLER1BETA.DAT', status='old',
+  iostat=IERROR, err=1000)
Open (unit=13, file='MILLTM1.DAT', status='new',
+  iostat=IERROR, err=1002)
Write (13,902)
902  Format (1X,'TMSEPs calculated for the Miller''s',
+  ' method:')
Write (13,901)
901  Format (1X,'DP',9X,'Miller''s')

Do 5 r=1,63,2

If (r.EQ.1) then
Infile='01.dat'
Else
If (r.EQ.3) then
Infile='03.dat'
Else
If (r.EQ.5) then
Infile='05.dat'
Else
If (r.EQ.7) then
Infile='07.dat'
Else
If (r.EQ.9) then
Infile='09.dat'
Else
If (r.EQ.11) then
Infile='11.dat'
Else
If (r.EQ.13) then
Infile='13.dat'
Else
If (r.EQ.15) then
Infile='15.dat'
Else
If (r.EQ.17) then
Infile='17.dat'
Else
If (r.EQ.19) then
Infile='19.dat'
Else
If (r.EQ.21) then
Infile='21.dat'
Else
If (r.EQ.23) then
Infile='23.dat'
Else
If (r.EQ.25) then
Infile='25.dat'
Else
If (r.EQ.27) then
Infile='27.dat'
Else
If (r.EQ.29) then
Infile='29.dat'
Else
If (r.EQ.31) then
Infile='31.dat'
Else
If (r.EQ.33) then
Infile='33.dat'
Else
If (r.EQ.35) then
Infile='35.dat'
Else
If (r.EQ.37) then
Infile='37.dat'
Else
If (r.EQ.39) then
Infile='39.dat'
Else
If (r.EQ.41) then
Infile='41.dat'
Else
If (r.EQ.43) then
Infile='43.dat'
Else
If (r.EQ.45) then
Infile='45.dat'
Else
If (r.EQ.47) then
Infile='47.dat'
Else
If (r.EQ.49) then
Infile='49.dat'
Else
If (r.EQ.51) then
Infile='51.dat'
Else
If (r.EQ.53) then
Infile='53.dat'
Else
If (r.EQ.55) then
Infile='55.dat'
Else
If (r.EQ.57) then
Infile='57.dat'
Else
If (r.EQ.59) then
Infile='59.dat'
Else
If (r.EQ.61) then
Infile='61.dat'
Else
If (r.EQ.63) then
Infile='63.dat'
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif

Open (unit=12, file=Infile, status='old', iostat=IERROR,
+  err=1001)

sumyssepmillers = 0
sumdifmillers = 0

Do 20 k=1,60
yssepmillers=0
Read (11,*,end=1003) num, b0, betas(1),
+  betas(2), betas(3), betas(4)

If (((r.GE.17).AND.(r.LE.32)).OR.(r.GE.49)) then
s = 20
Else
s = 10
Endif

Do 50 h=1,s
Read(12,*,end=1004)set,y,x(1,h),x(2,h),x(3,h),
+  x(4,h),ex
ypredmillers = b0
yactual = 0
x3ex1(1,h)= x(1,h)
x3ex1(2,h)= x(2,h)
x3ex1(3,h)= x(3,h)
x3ex1(4,h)= ex

Do 60 p=1,4
yactual = yactual+x(p,h)
60  Continue

Do 70 p=1,4
ypredmillers = ypredmillers + betas(p)*x3ex1(p,h)
70  Continue

yssepmillers = ((ypredmillers-yactual)**real(2))
+  + yssepmillers
50  Continue

sumyssepmillers = sumyssepmillers + yssepmillers
sumdifmillers = sumdifmillers + (s-num)
20  Continue

dpymsepmillers = sumyssepmillers / sumdifmillers

Write(13,900) r, dpymsepmillers
900  Format (1X,I2,5X,F10.6)
Close (12)
5  Continue
Close (11)
Close (13)
Go to 1300

******Error trap*****************************************
1000  Print *,'Something''s wrong with MILLER1BETA.DAT.'
Go to 1100
1001  Print *,'Something''s wrong with ',Infile
Go to 1100
1002  Print *,'Can''t seem to create MILLTM1.DAT.'
Go to 1100
1003  Print *, 'MILLER1BETA.DAT in unexpected format.'
ErrFlag = .TRUE.
Go to 1300
1004  Print *, 'File ',Infile,' is in an unexpected format.'
ErrFlag = .TRUE.
Go to 1300

1100  Continue
Print 1200, '+++ ERROR WHILE OPENING FILE +++',
+  ' error code = ', IERROR
1200  Format (/1X, A/ 1X, A, I8/)
ErrFlag = .TRUE.
*********************************************************
1300  Continue
END

*********************************************************

*  FORTRAN  PROGRAM  MILLTM3.FOR 

* 

*  This  program  is  designed  to  take  the  64  groups  of  60 

*  models  selected  via  Miller's  method  and  the  corresponding 

*  3840 data sets and find the "real" MSEP for each of the 32
*  even design points of 64 design points.
*********************************************************
Subroutine MILLTM3(ErrFlag)

Integer h,i,j,k,p,r,s
Integer num
Real b0, betas(6)
Real x(4,20),x3ex3(6,20),y,ex1,ex2,ex3
Real ypredmillers
Real ymsepmillers
Real yssepmillers
Real sumyssepmillers
Real sumdifmillers
Real dpymsepmillers
Character*6 Infile
Logical ErrFlag

Open (unit=11, file='MILLER3BETA.DAT', status='old',
+  iostat=IERROR, err=1000)
Open (unit=13, file='MILLTM3.DAT', status='new',
+  iostat=IERROR, err=1002)
Write (13,902)
902  Format (1X,'TMSEPs calculated for Miller''s method:')
Write (13,901)
901  Format (1X,'DP',9X,'Miller''s')

Do 5 r=2,64,2

If (r.EQ.2) then
Infile='02.dat'
Else
If (r.EQ.4) then
Infile='04.dat'
Else
If (r.EQ.6) then
Infile='06.dat'
Else
If (r.EQ.8) then
Infile='08.dat'
Else
If (r.EQ.10) then
Infile='10.dat'
Else
If (r.EQ.12) then
Infile='12.dat'
Else
If (r.EQ.14) then
Infile='14.dat'
Else
If (r.EQ.16) then
Infile='16.dat'
Else
If (r.EQ.18) then
Infile='18.dat'
Else
If (r.EQ.20) then
Infile='20.dat'
Else
If (r.EQ.22) then
Infile='22.dat'
Else
If (r.EQ.24) then
Infile='24.dat'
Else
If (r.EQ.26) then
Infile='26.dat'
Else
If (r.EQ.28) then
Infile='28.dat'
Else
If (r.EQ.30) then
Infile='30.dat'
Else
If (r.EQ.32) then
Infile='32.dat'
Else
If (r.EQ.34) then
Infile='34.dat'
Else
If (r.EQ.36) then
Infile='36.dat'
Else
If (r.EQ.38) then
Infile='38.dat'
Else
If (r.EQ.40) then
Infile='40.dat'
Else
If (r.EQ.42) then
Infile='42.dat'
Else
If (r.EQ.44) then
Infile='44.dat'
Else
If (r.EQ.46) then
Infile='46.dat'
Else
If (r.EQ.48) then
Infile='48.dat'
Else
If (r.EQ.50) then
Infile='50.dat'
Else
If (r.EQ.52) then
Infile='52.dat'
Else
If (r.EQ.54) then
Infile='54.dat'
Else
If (r.EQ.56) then
Infile='56.dat'
Else
If (r.EQ.58) then
Infile='58.dat'
Else
If (r.EQ.60) then
Infile='60.dat'
Else
If (r.EQ.62) then
Infile='62.dat'
Else
If (r.EQ.64) then
Infile='64.dat'
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif

Open (unit=12, file=Infile, status='old', iostat=IERROR,
+  err=1001)

sumyssepmillers = 0
sumdifmillers = 0

Do 20 k=1,60
yssepmillers=0
Read (11,*,end=1003)num,b0,betas(1),betas(2),betas(3),
+  betas(4), betas(5), betas(6)

If (((r.GE.17).AND.(r.LE.32)).OR.(r.GE.49)) then
s = 20
Else
s = 10
Endif

Do 50 h=1,s
Read(12,*,end=1004)set,y,x(1,h),x(2,h),
+  x(3,h),x(4,h),ex1,ex2,ex3
ypredmillers = b0
yactual = 0
x3ex3(1,h)= x(1,h)
x3ex3(2,h)= x(2,h)
x3ex3(3,h)= x(3,h)
x3ex3(4,h)= ex1
x3ex3(5,h)= ex2
x3ex3(6,h)= ex3

Do 60 p=1,4
yactual = yactual+x(p,h)
60  Continue

Do 70 p=1,6
ypredmillers = ypredmillers + betas(p)*x3ex3(p,h)
70  Continue

yssepmillers = ((ypredmillers-yactual)**real(2))
+  + yssepmillers
50  Continue

sumyssepmillers = sumyssepmillers + yssepmillers
sumdifmillers = sumdifmillers + (s-num)
20  Continue

dpymsepmillers = sumyssepmillers / sumdifmillers

Write(13,900) r, dpymsepmillers
900  Format (1X,I2,5X,F10.6)
Close (12)
5  Continue
Close (11)
Close (13)

Go to 1300

******Error trap*****************************************
1000  Print *,'Something''s wrong with MILLER3BETA.DAT.'
Go to 1100
1001  Print *,'Something''s wrong with ',Infile
Go to 1100
1002  Print *,'Can''t seem to create MILLTM3.DAT.'
Go to 1100
1003  Print *, 'MILLER3BETA.DAT in unexpected format.'
ErrFlag = .TRUE.
Go to 1300
1004  Print *, 'File ',Infile,' is in an unexpected format.'
ErrFlag = .TRUE.
Go to 1300

1100  Continue
Print 1200, '+++ ERROR WHILE OPENING FILE +++',
+  ' error code = ', IERROR
1200  Format (/1X, A/ 1X, A, I8/)
ErrFlag = .TRUE.
*********************************************************
1300  Continue
END

************************************************************
**************  fortran program STEPCOUNT1.FOR  **************
************************************************************
SUBROUTINE Stepcount1 (NewOut)

integer num, numvar,h,i,j,k,n,p,q,r,s,t,v,w,x,y,z
integer emiller, varsmiller, cumemiller, cmiller
integer chartmiller(0:3,0:3), ReadCount
real avgvars, avgevars, millerpm
character*1 numchar
character*2 m(9), Model_var, Good_model(5)
character*20 NewOut
logical ModelNotFound, EndofFile

num=0
emiller=0
varsmiller=0
cumemiller=0
cmiller=0
ReadCount=0
ModelNotFound=.TRUE.
EndofFile=.FALSE.

do 10 i=1,8
m(i)='  '
10  continue
do 20 j=1,4
Good_model(j)='  '
20  continue
do 30 k=0,3
do 40 h=0,3
chartmiller(k,h)=0
40  continue
30  continue

open (unit=11, file='temp.dat', status='old',
+  iostat=IERROR, err=1000)
open (unit=12, file=NewOut, status='new',
+  iostat=IERROR, err=1000)
open (unit=13, file='PMstep1.dat', status='new',
+  iostat=IERROR, err=1000)
write(13,*)'  DESIGNPOINT  MILLER''S  PM'

ReadCount=ReadCount+1
Read(11,900,end=90)numchar, Model_var
900  Format(4X,A1,4X,A2)

IF (numchar.EQ.'1') THEN
num=1
ELSE
IF (numchar.EQ.'2') THEN
num=2
ELSE
IF (numchar.EQ.'3') THEN
num=3
ELSE
IF (numchar.EQ.'4') THEN
num=4
ELSE
IF (numchar.EQ.'5') THEN
num=5
ELSE
IF (numchar.EQ.'6') THEN
num=6
ELSE
IF (numchar.EQ.'7') THEN
num=7
ELSE
IF (numchar.EQ.'8') THEN
num=8
ELSE
Print *,'Unexpected format in TEMP.DAT: ',
+  'Numbers 1,2,3, etc., not found!'
Go to 1300
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF

IF (num.NE.1) THEN
Print *,'Processing terminated. Input file ',
+  'in unexpected format: 1st number must be 1.'
Go to 1300
ELSE
m(num)=Model_var
ENDIF

Do 60 n=1,63,2
Write(12,*)' '
Write(12,*)' '
Write(12,*)' '
Write(12,*) '******************** DESIGN POINT ',
+  n,' ********************'
Write(12,*)' '
Write(12,*)'Replication  #Vars  Model'
do 70 p=1,60
80  Continue
IF (EndofFile) THEN
Print *,'Unexpected file format! File does not ',
+  'have correct # of design points and reps.'
Go to 1300
ENDIF
ReadCount=ReadCount+1
Read(11,900,end=90) numchar, Model_var

IF (numchar.EQ.'1') THEN
num=1
ELSE
IF (numchar.EQ.'2') THEN
num=2
ELSE
IF (numchar.EQ.'3') THEN
num=3
ELSE
IF (numchar.EQ.'4') THEN
num=4
ELSE
IF (numchar.EQ.'5') THEN
num=5
ELSE
IF (numchar.EQ.'6') THEN
num=6
ELSE
IF (numchar.EQ.'7') THEN
num=7
ELSE
IF (numchar.EQ.'8') THEN
num=8
ELSE
Print *,'Unexpected format in TEMP.DAT: ',
+  'Numbers 1,2,3, etc., not found!'
Go to 1300
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF

110  Continue
IF (num.NE.1) THEN
m(num)=Model_var
ELSE
continue
do 100 q=1,4
IF ((m(q)(1:1).NE.'R').AND.ModelNotFound) THEN
Good_model(q)=m(q)
numvar=numvar+1
ELSE
ModelNotFound=.FALSE.
ENDIF
100  continue
Write(12,901)p, numvar, (Good_model(r),r=1,4)
901  FORMAT (' ',4X,I2,11X,I2,6X,A2,1X,A2,1X,
+  A2,1X,A2)
Go to 120
ENDIF
GO TO 80

120  continue
IF (numvar.LE.0) Go to 140
do 130 s=1,numvar
if (m(s).EQ.'E1') then
emiller=emiller+1
endif
130  continue
140  varsmiller=varsmiller+numvar
cumemiller=cumemiller+emiller
cmiller=numvar-emiller
chartmiller(cmiller,emiller)=chartmiller(cmiller,
+  emiller)+1

do 150 t=1,8
m(t)='  '
150  Continue
m(num)=Model_var
do 160 v=1,4
Good_model(v)='  '
160  continue
emiller=0
numvar=0
ModelNotFound=.TRUE.
70  continue

write(12,*) ' '
write(12,*) '************************************',
+  '************************************'
write(12,*) ' '

IF (varsmiller.GT.0) THEN
avgvars = real(varsmiller)/60.0
avgevars = real(cumemiller)/60.0
millerpm = 1-(avgevars/avgvars)
ELSE
avgvars=0
avgevars=0
millerpm=0
ENDIF

write(12,*) 'The avg number of vars using Miller''s',
+  ' method was ', avgvars
write(12,*) 'The avg number of extraneous vars from',
+  ' Miller''s method was', avgevars
write(12,*) '****** The PM for Miller''s was ',
+  millerpm,' ******'
write(12,*) ' '
write(12,*) ' '
write(12,*) 'Correct Vars (0-3, down) -VS- ',
+  'Extraneous Vars (0-3,across)'
write(12,*) ' '
write(12,*) ' Table for Miller''s Method'
write(12,*) ' '
do 170 w=0,3
write(12,*) (chartmiller(w,x),x=0,3)
170  continue
Write(13,*) n,'  ',millerpm

do 180 y = 0,3
do 190 z = 0,3
chartmiller(y,z)=0
190  continue
180  continue
varsmiller=0
cumemiller=0
60  Continue
Close(11)
Close(12)
Close(13)
GO TO 1200

*  Error trap: *******************************************
1000  Continue
Print 1100, '+++ ERROR WHILE OPENING FILE +++',
+  ' error code = ', IERROR
1100  FORMAT(/1X, A/ 1X, A, I8/)
*********************************************************
1200  CONTINUE
Print *,'Counting complete. ', NewOut,' written.'
Go to 1300

90  Print*,'End of File encountered at line ',ReadCount
Print*,'Design Point: ',n,' Replication: ',p
num=1
Model_var='**'
EndofFile=.TRUE.
Go to 110

1300  CONTINUE
END

**************  fortran program STEPCOUNT3.for  **************
************************************************************


SUBROUTINE  Stepcount3  (NewOut) 

integer num, numvar,h,i,j,k,n,p,q,r,s,t,v,w,x,y,z
integer emiller, varsmiller, cumemiller, cmiller
integer chartmiller(0:3,0:3), ReadCount
real avgvars, avgevars, millerpm
character*2 m(13), Model_var, Good_model(7), numchar
character*20 NewOut
logical ModelNotFound, EndofFile

num=0
emiller=0
varsmiller=0
cumemiller=0
cmiller=0
ReadCount=0
ModelNotFound=.TRUE.
EndofFile=.FALSE.

do 10 i=1,12
m(i)='  '
10  continue
do 20 j=1,6
Good_model(j)='  '
20  continue
do 30 k=0,3
do 40 h=0,3
chartmiller(k,h)=0
40  continue
30  continue

open (unit=11, file='temp.dat', status='old',
+  iostat=IERROR, err=1000)
open (unit=12, file=NewOut, status='new',
+  iostat=IERROR, err=1000)
open (unit=13, file='PMstep3.dat', status='new',
+  iostat=IERROR, err=1000)
write(13,*)'  DESIGNPOINT  MILLER''S  PM'

ReadCount=ReadCount+1
Read(11,900,end=90)numchar, Model_var
900  Format(3X,A2,4X,A2)

IF (numchar.EQ.' 1') THEN
num=1
ELSE
IF (numchar.EQ.' 2') THEN
num=2
ELSE
IF (numchar.EQ.' 3') THEN
num=3
ELSE
IF (numchar.EQ.' 4') THEN
num=4
ELSE
IF (numchar.EQ.' 5') THEN
num=5
ELSE
IF (numchar.EQ.' 6') THEN
num=6
ELSE
IF (numchar.EQ.' 7') THEN
num=7
ELSE
IF (numchar.EQ.' 8') THEN
num=8
ELSE
IF (numchar.EQ.' 9') THEN
num=9
ELSE
IF (numchar.EQ.'10') THEN
num=10
ELSE
IF (numchar.EQ.'11') THEN
num=11
ELSE
IF (numchar.EQ.'12') THEN
num=12
ELSE
Print *,'Unexpected format in TEMP.DAT: ',
+  'Numbers 1,2,3, etc., not found!'
Go to 1300
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF

IF (num.NE.1) THEN
Print *,'Processing terminated. Input file ',
+  'in unexpected format: 1st number must be 1.'
Go to 1300
ELSE
m(num)=Model_var
ENDIF

Do 60 n=2,64,2
Write(12,*)' '
Write(12,*)' '
Write(12,*)' '
Write(12,*) '******************** DESIGN POINT ',
+  n,' ********************'
Write(12,*)' '
Write(12,*)'Replication  #Vars  Model'
do 70 p=1,60
80  Continue
IF (EndofFile) THEN
Print *,'Unexpected file format! File does not ',
+  'have the correct # of design points and reps.'
Go to 1300
ENDIF
ReadCount=ReadCount+1
Read(11,900,end=90) numchar, Model_var

IF (numchar.EQ.' 1') THEN
num=1
ELSE
IF (numchar.EQ.' 2') THEN
num=2
ELSE
IF (numchar.EQ.' 3') THEN
num=3
ELSE
IF (numchar.EQ.' 4') THEN
num=4
ELSE
IF (numchar.EQ.' 5') THEN
num=5
ELSE
IF (numchar.EQ.' 6') THEN
num=6
ELSE
IF (numchar.EQ.' 7') THEN
num=7
ELSE
IF (numchar.EQ.' 8') THEN
num=8
ELSE
IF (numchar.EQ.' 9') THEN
num=9
ELSE
IF (numchar.EQ.'10') THEN
num=10
ELSE
IF (numchar.EQ.'11') THEN
num=11
ELSE
IF (numchar.EQ.'12') THEN
num=12
ELSE
Print *,'Unexpected format in TEMP.DAT: ',
+  'Numbers 1,2,3, etc., not found!'
Go to 1300
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF

110  Continue
IF (num.NE.1) THEN
m(num)=Model_var
ELSE
continue
do 100 q=1,6
IF ((m(q)(1:1).NE.'R').AND.ModelNotFound) THEN
Good_model(q)=m(q)
numvar=numvar+1
ELSE
ModelNotFound=.FALSE.
ENDIF
100  continue
Write(12,901)p, numvar, (Good_model(r),r=1,6)
901  FORMAT (' ',4X,I2,11X,I2,6X,A2,1X,A2,1X,
+  A2,1X,A2,1X,A2,1X,A2)
Go to 120
ENDIF
GO TO 80

120  continue
IF (numvar.LE.0) Go to 140
do 130 s=1,numvar
if ((m(s).EQ.'E1').OR.(m(s).EQ.'E2').OR.
+  (m(s).EQ.'E3')) then
emiller=emiller+1
endif
130  continue
140  varsmiller=varsmiller+numvar
cumemiller=cumemiller+emiller
cmiller=numvar-emiller
chartmiller(cmiller,emiller)=chartmiller(cmiller,
+  emiller)+1

do 150 t=1,12
m(t)='  '
150  Continue
m(num)=Model_var
do 160 v=1,6
Good_model(v)='  '
160  continue
emiller=0
numvar=0
ModelNotFound=.TRUE.
70  continue

write(12,*) ' '
write(12,*) '************************************',
+  '************************************'
write(12,*) ' '

IF (varsmiller.GT.0) THEN
avgvars = real(varsmiller)/60.0
avgevars = real(cumemiller)/60.0
millerpm = 1-(avgevars/avgvars)
ELSE
avgvars=0
avgevars=0
millerpm=0
ENDIF

write(12,*) 'The avg number of vars using Miller''s',
+  ' method was ', avgvars
write(12,*) 'The avg number of extraneous vars from',
+  ' Miller''s method was', avgevars
write(12,*) '****** The PM for Miller''s was ',
+  millerpm,' ******'
write(12,*) ' '
write(12,*) ' '
write(12,*) 'Correct Vars (0-3, down) -VS- ',
+  'Extraneous Vars (0-3,across)'
write(12,*) ' '
write(12,*) ' Table for Miller''s Method'
write(12,*) ' '
do 170 w=0,3
write(12,*) (chartmiller(w,x),x=0,3)
170  continue
Write(13,*) n,'  ',millerpm

do 180 y = 0,3
do 190 z = 0,3
chartmiller(y,z)=0
190  continue
180  continue
varsmiller=0
cumemiller=0
60  Continue
Close(11)
Close(12)
Close(13)
GO TO 1200

*  Error trap: *******************************************
1000  Continue
Print 1100, '+++ ERROR WHILE OPENING FILE +++',
+  ' error code = ', IERROR
1100  FORMAT(/1X, A/ 1X, A, I8/)
********************************************************
1200  CONTINUE
Print *,'Counting complete. ', NewOut,' written.'
Go to 1300

90  Print*,'End of File encountered at line ',ReadCount
Print*,'Design Point: ',n,' Replication: ',p
num=1
Model_var='**'
EndofFile=.TRUE.
Go to 110

1300  CONTINUE 
END 
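
The PM figure written by Stepcount1 and Stepcount3 is one minus the ratio of extraneous variables to all selected variables. Both averages divide by the same 60 replications, so the ratio can be taken on the running totals directly. The following is a minimal free-form Fortran sketch of that calculation only; the module name `pm_sketch` and the function name `percent_correct` are hypothetical and do not appear in the thesis code.

```fortran
module pm_sketch
  implicit none
contains
  ! Hypothetical helper, not part of the thesis listing.
  ! total_vars       : variables selected over all replications
  !                    (varsmiller in the listing)
  ! total_extraneous : extraneous variables among them (cumemiller)
  ! Returns 1 - extraneous/total, or 0.0 when nothing was selected,
  ! mirroring the IF (varsmiller.GT.0) guard in the listing.
  pure function percent_correct(total_vars, total_extraneous) result(pm)
    integer, intent(in) :: total_vars, total_extraneous
    real :: pm
    if (total_vars > 0) then
      pm = 1.0 - real(total_extraneous) / real(total_vars)
    else
      pm = 0.0
    end if
  end function percent_correct
end module pm_sketch
```

For example, `percent_correct(120, 30)` describes a design point where 120 variables were selected across the replications and 30 of them were extraneous.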
************************************************************
*  TMSEP.FOR
*
*  This program takes SAS R-Squared listings in any file
*  (with switches MSE, SP, CP, and B) and extracts models
*  with the lowest MSE, Cp, and Sp. Then, using the original
*  data files (01.dat, 02.dat,...,64.dat), it calculates the
*  theoretical performance measure (TMSEP). This program
*  calls subroutines TMSEP1.FOR, TMSEP2.FOR, TMSEP3.FOR,
*  and TMSEP4.FOR and writes the calculated TMSEP's to
*  TMSEP.DAT.
************************************************************
Character*20 NewIn
Character*132 Line
CHARACTER I, J, K, L
Integer Var
Logical VarFlag,ErrFlag

5  Continue
Print *,'Name of file to examine? (20 char or less;',
+  ' * to quit)'
Read (*,'(A20)') NewIn
If (NewIn(1:1).EQ.'*') GO TO 999

7  Continue
Print *,'Number of extraneous variables? (1 or 3',
+  ' ONLY!!)'
Read (*,'(I1)') Var
If ((Var.NE.1).AND.(Var.NE.3)) Go To 7
VarFlag = (Var.EQ.3)

Open (unit=10, file=NewIn, status='OLD',
&  iostat=IERROR, err=1000)
Open (unit=11, file='temp.dat', status='NEW',
&  iostat=IERROR, err=1000)

10  Continue
Read(10,200,END=888) Line
I = LINE (14:14)
J = LINE (15:17)
K = LINE (9:9)
L = LINE (10:12)
IF (VarFlag) GO TO 777

IF ((I.EQ.'1').AND.(J.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'2').AND.(J.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'3').AND.(J.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((I.EQ.'4').AND.(J.EQ.' ')) THEN
WRITE (11,200) LINE
ENDIF
ENDIF
ENDIF
ENDIF
GO TO 10

777  Continue
IF ((K.EQ.'1').AND.(L.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((K.EQ.'2').AND.(L.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((K.EQ.'3').AND.(L.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((K.EQ.'4').AND.(L.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((K.EQ.'5').AND.(L.EQ.' ')) THEN
WRITE (11,200) LINE
ELSE
IF ((K.EQ.'6').AND.(L.EQ.' ')) THEN
WRITE (11,200) LINE
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
ENDIF
GO TO 10


200  Format (A132)

888  Continue
Close(10)
Close(11)
Print *,'Filtering complete on ',NewIn,'.',
+  ' TMSEP calculations begun.'
ErrFlag = .FALSE.

IF (VarFlag) THEN
Call TMSEP3(ErrFlag)
If (ErrFlag) Go to 5
Print *,'TMSEP''s calculated for designpoints with',
+  ' 3 extraneous variables and written to TMSEP3.DAT.'
Print *,' '
ELSE
Call TMSEP1(ErrFlag)
If (ErrFlag) Go to 5
Print *,'TMSEP''s calculated for designpoints with',
+  ' 1 extraneous variable and written to TMSEP1.DAT.'
Print *,' '
ENDIF
GO TO 5

999  Continue
Print *,'Processing complete. Program terminated.'
Stop

*  Error trap: *******************************************
1000  Continue
Print 1100, '+++ ERROR WHILE OPENING FILE +++',
&  ' error code = ', IERROR
1100  FORMAT(/1X, A/ 1X, A, I8/)
GO TO 5
*********************************************************

END 
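
The TMSEP subroutines all follow the same arithmetic: for each replication, sum the squared differences between the model's prediction and the noise-free response (the sum of the four x columns), then divide the total over all replications by the total of (sample size minus number of fitted terms). The sketch below restates that arithmetic in free-form Fortran under those assumptions. The module `tmsep_sketch` and the functions `ssep` and `tmsep` are hypothetical names, not routines from the thesis.

```fortran
module tmsep_sketch
  implicit none
contains
  ! Hypothetical helper: squared prediction error of one fitted model
  ! over one data set.
  ! xtrue(:,h) : columns whose sum is taken as the noise-free response
  ! xcand(:,h) : candidate predictors offered to the selection method
  ! b0, betas  : fitted intercept and coefficients; betas must have
  !              one entry per row of xcand
  pure function ssep(xtrue, xcand, b0, betas) result(total)
    real, intent(in) :: xtrue(:,:), xcand(:,:), b0, betas(:)
    real :: total, yactual, ypred
    integer :: h
    total = 0.0
    do h = 1, size(xcand, 2)
      yactual = sum(xtrue(:, h))
      ypred = b0 + dot_product(betas, xcand(:, h))
      total = total + (ypred - yactual)**2
    end do
  end function ssep

  ! Hypothetical helper: pooled TMSEP for one design point.
  ! ssep_by_rep    : ssep value for each replication
  ! nparams_by_rep : the "num" value read for each selected model
  ! nobs           : observations per data set (10 or 20 in the listing)
  pure function tmsep(ssep_by_rep, nparams_by_rep, nobs) result(value)
    real, intent(in) :: ssep_by_rep(:)
    integer, intent(in) :: nparams_by_rep(:), nobs
    real :: value
    value = sum(ssep_by_rep) / real(sum(nobs - nparams_by_rep))
  end function tmsep
end module tmsep_sketch
```

Pooling the numerator and denominator before dividing, as the listing does, weights each replication by its residual degrees of freedom instead of averaging 60 separate ratios.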
*  FORTRAN PROGRAM TMSEP1.FOR
*
*  This program is designed to take a modified SAS program and
*  an existing data set and find the "real" MSEP for the
*  models chosen by mse, sp, and cp criteria.

Subroutine TMSEP1(ErrFlag)

Integer h,i,j,k,p,r,s,ptrmse,ptrsp,ptrcp
Integer check(4),num(15)
Real b0(15),r2(15),cp(15),mse(15),sp(15),betas(4,15)
Real x(4,20),x3ex1(4,20),ex,y
Real minmse,mincp,minsp
Real ypredcp,ypredmse,ypredsp
Real ymsepmse,ymsepsp,ymsepcp
Real yssepmse,yssepsp,yssepcp
Real sumyssepmse,sumyssepsp,sumyssepcp
Real sumdifmse,sumdifsp,sumdifcp
Real dpymsepmse,dpymsepsp,dpymsepcp
Character*6 Infile
Logical ErrFlag

check(1)=1
check(2)=5
check(3)=11
check(4)=15

Open (unit=11,file='TEMP.DAT',status='old',
+  iostat=IERROR,err=1000)
Open (unit=13,file='TMSEP1.DAT',status='new',
+  iostat=IERROR,err=1002)
Write (13,902)
902  Format (1X,'TMSEPs calculated for the following',
+  ' methods:')
Write (13,901)
901  Format (1X,'DP',9X,'MSE',13X,'SP',13X,'CP')

Do 5 r=1,63,2

If (r.EQ.1) then
Infile='01.dat'
Else
If (r.EQ.3) then
Infile='03.dat'
Else
If (r.EQ.5) then
Infile='05.dat'
Else
If (r.EQ.7) then
Infile='07.dat'
Else
If (r.EQ.9) then
Infile='09.dat'
Else
If (r.EQ.11) then
Infile='11.dat'
Else
If (r.EQ.13) then
Infile='13.dat'
Else
If (r.EQ.15) then
Infile='15.dat'
Else
If (r.EQ.17) then
Infile='17.dat'
Else
If (r.EQ.19) then
Infile='19.dat'
Else
If (r.EQ.21) then
Infile='21.dat'
Else
If (r.EQ.23) then
Infile='23.dat'
Else
If (r.EQ.25) then
Infile='25.dat'
Else
If (r.EQ.27) then
Infile='27.dat'
Else
If (r.EQ.29) then
Infile='29.dat'
Else
If (r.EQ.31) then
Infile='31.dat'
Else
If (r.EQ.33) then
Infile='33.dat'
Else
If (r.EQ.35) then
Infile='35.dat'
Else
If (r.EQ.37) then
Infile='37.dat'
Else
If (r.EQ.39) then
Infile='39.dat'
Else
If (r.EQ.41) then
Infile='41.dat'
Else
If (r.EQ.43) then
Infile='43.dat'
Else
If (r.EQ.45) then
Infile='45.dat'
Else
If (r.EQ.47) then
Infile='47.dat'
Else
If (r.EQ.49) then
Infile='49.dat'
Else
If (r.EQ.51) then
Infile='51.dat'
Else
If (r.EQ.53) then
Infile='53.dat'
Else
If (r.EQ.55) then
Infile='55.dat'
Else
If (r.EQ.57) then
Infile='57.dat'
Else
If (r.EQ.59) then
Infile='59.dat'
Else
If (r.EQ.61) then
Infile='61.dat'
Else
If (r.EQ.63) then
Infile='63.dat'
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif
Endif

Open (unit=12, file=Infile, status='old', iostat=IERROR,
+  err=1001)

sumyssepmse = 0
sumyssepsp = 0
sumyssepcp = 0
sumdifmse = 0
sumdifsp = 0
sumdifcp = 0

Do 20 k=1,60
minmse = 10000
minsp = 10000
mincp = 10000
yssepmse=0
yssepsp =0
yssepcp =0
ptrmse = 0
ptrsp = 0
ptrcp = 0

Do 10 i=1,15
Read (11,*,end=1003) num(i), r2(i), cp(i),
+  mse(i), sp(i), b0(i), betas(1,i), betas(2,i),
+  betas(3,i), betas(4,i)
10  Continue

Do 30 j=1,4
If (mse(check(j)).lt.minmse) then
minmse = mse(check(j))
ptrmse = check(j)
Endif
If (sp(check(j)).lt.minsp) then
minsp = sp(check(j))
ptrsp = check(j)
Endif
If (cp(check(j)).lt.mincp) then
mincp = cp(check(j))
ptrcp = check(j)
Endif
30  Continue

If (((r.GE.17).AND.(r.LE.32)).OR.(r.GE.49)) then
s = 20
Else
s = 10
Endif

Do 50 h=1,s
Read (12,*,end=1004) set,y,x(1,h),x(2,h),x(3,h),
+  x(4,h),ex
ypredmse = b0(ptrmse)
ypredsp = b0(ptrsp)
ypredcp = b0(ptrcp)
yactual = 0
x3ex1(1,h)= x(1,h)
x3ex1(2,h)= x(2,h)
x3ex1(3,h)= x(3,h)
x3ex1(4,h)= ex

Do 60 p=1,4
yactual = yactual+x(p,h)
60  Continue

Do 70 p=1,4
ypredmse = ypredmse + betas(p,ptrmse)*x3ex1(p,h)
ypredsp = ypredsp + betas(p,ptrsp) *x3ex1(p,h)
ypredcp = ypredcp + betas(p,ptrcp) *x3ex1(p,h)
70  Continue

yssepmse = ((ypredmse-yactual)**real(2)) +
+  yssepmse
yssepsp = ((ypredsp -yactual)**real(2)) + yssepsp
yssepcp = ((ypredcp -yactual)**real(2)) + yssepcp
50  Continue

sumyssepmse = sumyssepmse + yssepmse
sumyssepsp = sumyssepsp + yssepsp
sumyssepcp = sumyssepcp + yssepcp
sumdifmse = sumdifmse + (s-num(ptrmse))
sumdifsp = sumdifsp + (s-num(ptrsp))
sumdifcp = sumdifcp + (s-num(ptrcp))
20  Continue

dpymsepmse = sumyssepmse / sumdifmse
dpymsepsp = sumyssepsp / sumdifsp
dpymsepcp = sumyssepcp / sumdifcp

Write(13,900) r, dpymsepmse, dpymsepsp, dpymsepcp
900  Format (1X,I2,5X,F10.6,5X,F10.6,5X,F10.6)
Close (12)
5  Continue
Close (11)
Close (13)
Go to 1300

******Error trap*****************************************
1000  Print *,'Something''s wrong with TEMP.DAT.'
Go to 1100
1001  Print *,'Something''s wrong with ',Infile
Go to 1100
1002  Print *,'Can''t seem to create TMSEP1.DAT.'
Go to 1100
1003  Print *, 'TEMP.DAT in unexpected format.'
Go to 1300
1004  Print *, 'File ',Infile,' is in an unexpected format.'
Go to 1300

1100  Continue
Print 1200, '+++ ERROR WHILE OPENING FILE +++',
+  ' error code = ', IERROR
1200  Format (/1X, A/ 1X, A, I8/)
ErrFlag = .TRUE.
1300  Continue


END 
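
The long IF/ELSE chains in these subroutines map the design point number r to a zero-padded file name, and a separate test maps r to the sample size s. Both rules can be stated compactly with an internal WRITE and one logical expression. The sketch below is an alternative formulation in free-form Fortran, not the author's code; the module `infile_sketch` and the functions `data_file_name` and `sample_size` are hypothetical names introduced for illustration.

```fortran
module infile_sketch
  implicit none
contains
  ! Hypothetical helper: six-character data file name for design
  ! point r (1..64), e.g. 7 gives '07.dat'. The I2.2 edit descriptor
  ! pads the number to two digits with a leading zero.
  function data_file_name(r) result(name)
    integer, intent(in) :: r
    character(len=6) :: name
    write (name, '(I2.2,A4)') r, '.dat'
  end function data_file_name

  ! Hypothetical helper: observations per data set, following the
  ! rule in the listing (20 for design points 17-32 and 49 upward,
  ! otherwise 10).
  pure function sample_size(r) result(n)
    integer, intent(in) :: r
    integer :: n
    if ((r >= 17 .and. r <= 32) .or. r >= 49) then
      n = 20
    else
      n = 10
    end if
  end function sample_size
end module infile_sketch
```

With these helpers, the body of the design point loop could assign `Infile = data_file_name(r)` and `s = sample_size(r)` in place of the nested chains.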
************************************************************
*  FORTRAN PROGRAM TMSEP3.FOR
*
*  This program is designed to take a modified SAS program and
*  an existing data set and find the "real" MSEP for the
*  models chosen by mse, sp, and cp criteria.
*
************************************************************
Subroutine TMSEP3(ErrFlag)

Integer h,i,j,k,p,r,s,ptrmse,ptrsp,ptrcp
Integer check(6),num(63)
Real b0(63),r2(63),cp(63),mse(63),sp(63),betas(6,63)
Real x(4,20),x3ex3(6,20),y,ex1,ex2,ex3
Real minmse,mincp,minsp
Real ypredcp,ypredmse,ypredsp
Real ymsepmse,ymsepsp,ymsepcp
Real yssepmse,yssepsp,yssepcp
Real sumyssepmse,sumyssepsp,sumyssepcp
Real sumdifmse,sumdifsp,sumdifcp
Real dpymsepmse,dpymsepsp,dpymsepcp
Character*6 Infile
Logical ErrFlag

check(1)=1
check(2)=7
check(3)=22
check(4)=42
check(5)=57
check(6)=63

Open (unit=11,file='TEMP.DAT',status='old',
+  iostat=IERROR,err=1000)
Open (unit=13,file='TMSEP3.DAT',status='new',
+  iostat=IERROR,err=1002)
Write (13,902)
902  Format (1X,'TMSEPs calculated for the following',
+  ' methods:')
Write (13,901)
901  Format (1X,'DP',9X,'MSE',13X,'SP',13X,'CP')

Do 5 r=2,64,2

If  (r.EQ.2)  then 
Infile- '02.dat' 

Else 

If  (r.EQ.4)  then 
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Infile=' 04 .dat ' 

Else 

If  (r.EQ.6)  then 
Inf ile»' 06 .dat ' 

Else 

If  (r.EQ.8)  then 
Inf ile=' 08.dat' 

Else 

If  (r.EQ.lO)  then 
Infile®' 10 .dat ' 

Else 

If  (r.EQ.12)  then 
Infila®' 12 .dat ' 

Else 

If  (r.EQ.14)  then 
Infile®' 14.dat' 

Else 

If  (r.EQ.16)  than 
I  Infile='16.dat' 

I  Else 

If  (r.EQ.18)  then 
Infile®' 18.dat' 

I  E  X  S6 

If  (r.EQ.20)  thei? 
j  Infile® '20.dat ' 

Els0 

I  If  (r.EQ.22)  then 
i  Infile® '22 .dat ' 

Else 

If  (r.EQ.24)  then 
Infile='24 .dat' 

Else 

If  (r.EQ.26)  then 
Infile='26.dat' 

Else 

If  (r.EQ.28)  then 
Inf il a® '28. dat' 

A^lse 

If  (r.EQ.30)  then 
Inf ile®' 30.dat ' 

Else 

If  (r.EQ.32)  then 
Inf ile® '3 2. dat' 

Else 

If  (r.EQ.34)  then 
Infile='34.dat ' 
Else 

If  (r.EQ.36)  then 
Inf ile® '36. dat' 
Else 

If  (r.EQ.38)  then 
Infile®' 3P.dat ' 
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Else 

If  (r.EQ.40)  then 
Infile" '40.dat' 

Else 

If  (r.EQ.42)  then 
Infile"'42.dat' 

Else 

If  (r.EQ.44)  then 
Inf ile“'44 .dat ' 

Else 

If  (r.nQ.46)  then 
Inf ile"'46 .dat ' 

Else 

If  (r.E0v48)  then 
Infile" '4 8.dat' 

Else 

If  (r.EQ.50)  then 
Infile“'50.dat ' 

Else 

If  (r.EQ.52)  then 
Infile" '52.dat' 

Else 

If  (r.EQ.54)  then 
Infile- '54.dat' 

Else 

If  (r.EQ.56)  then 
Infile-'56.dat ' 

Else 

If  {r.EQ.58)  then 
Infile- '58.dat' 

Else 

If  (r.EQ.60)  then 
Infile-' 60.dat' 
Else 

If  (r.EQ.62)  then 
Infile-'62.dat ' 
Else 

If  (r.EQ.64)  then 
Infile- '64.dat' 
Endif 
Endif 
Endif 
Endif 
Endif 
Endif 
Endif 
Endif 
Endif 
Endif 
Endif 
Endif 
Endif 
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Endif 

Endif 

Endif 

Endif 

Endif 

Endif 

Endif 

Endif 

Endif 

Endif 

Endif 

Endif 

Endif 

Endif 

Endif 

Endif 

Endif 

Endif 

Endif 

      Open(unit=12,file=Infile,status='old',
     +iostat=IERROR,err=1001)

      sumyssepmse = 0
      sumyssepsp  = 0
      sumyssepcp  = 0
      sumdifmse   = 0
      sumdifsp    = 0
      sumdifcp    = 0

      Do 20 k=1,60

      minmse = 10000
      minsp  = 10000
      mincp  = 10000
      yssepmse = 0
      yssepsp  = 0
      yssepcp  = 0
      ptrmse = 0
      ptrsp  = 0
      ptrcp  = 0

      Do 10 i=1,63
         Read (11,*,end=1003) num(i), r2(i), cp(i),
     +        mse(i), sp(i), b0(i), betas(1,i), betas(2,i),
     +        betas(3,i), betas(4,i),
     +        betas(5,i), betas(6,i)
10    Continue

      Do 30 j=1,6
         If (mse(check(j)).lt.minmse) then
            minmse = mse(check(j))
            ptrmse = check(j)
         Endif
         If (sp(check(j)).lt.minsp) then
            minsp = sp(check(j))
            ptrsp = check(j)
         Endif
         If (cp(check(j)).lt.mincp) then
            mincp = cp(check(j))
            ptrcp = check(j)
         Endif
30    Continue

      If (((r.GE.17).AND.(r.LE.32)).OR.(r.GE.49)) then
         s = 20
      Else
         s = 10
      Endif

      Do 50 h=1,s

      Read (12,*,end=1004) set,y,x(1,h),x(2,h),
     +     x(3,h),x(4,h),ex1,ex2,ex3

      ypredmse = b0(ptrmse)
      ypredsp  = b0(ptrsp)
      ypredcp  = b0(ptrcp)
      yactual  = 0
      x3ex3(1,h) = x(1,h)
      x3ex3(2,h) = x(2,h)
      x3ex3(3,h) = x(3,h)
      x3ex3(4,h) = ex1
      x3ex3(5,h) = ex2
      x3ex3(6,h) = ex3

      Do 60 p=1,4
         yactual = yactual+x(p,h)
60    Continue

      Do 70 p=1,6
         ypredmse = ypredmse + betas(p,ptrmse)*x3ex3(p,h)
         ypredsp  = ypredsp  + betas(p,ptrsp) *x3ex3(p,h)
         ypredcp  = ypredcp  + betas(p,ptrcp) *x3ex3(p,h)
70    Continue

      yssepmse = ((ypredmse-yactual)**real(2)) +
     +           yssepmse
      yssepsp  = ((ypredsp -yactual)**real(2)) + yssepsp
      yssepcp  = ((ypredcp -yactual)**real(2)) + yssepcp

50    Continue

      sumyssepmse = sumyssepmse + yssepmse
      sumyssepsp  = sumyssepsp  + yssepsp
      sumyssepcp  = sumyssepcp  + yssepcp
      sumdifmse   = sumdifmse + (s-num(ptrmse))
      sumdifsp    = sumdifsp  + (s-num(ptrsp))
      sumdifcp    = sumdifcp  + (s-num(ptrcp))

20    Continue

      dpymsepmse = sumyssepmse / sumdifmse
      dpymsepsp  = sumyssepsp  / sumdifsp
      dpymsepcp  = sumyssepcp  / sumdifcp

      Write (13,900) r, dpymsepmse, dpymsepsp, dpymsepcp
900   Format (1X,I2,5X,F10.6,5X,F10.6,5X,F10.6)

      Close (12)

5     Continue

      Close (11)
      Close (13)

      Go to 1300

******Error trap********************************************

1000  Print *,'Something''s wrong with TEMP.DAT.'
      Go to 1100
1001  Print *,'Something''s wrong with ', Infile
      Go to 1100
1002  Print *,'Can''t seem to create TMSEP3.DAT.'
      Go to 1100
1003  Print *, 'TEMP.DAT in unexpected format.'
      Go to 1300
1004  Print *, 'File ', Infile,' is in an unexpected format.'
      Go to 1300

1100  Continue
      Print 1200, '+++ ERROR WHILE OPENING FILE +++',
     +            ' error code = ', IERROR
1200  Format (/1X, A/ 1X, A, I8/)
      ErrFlag = .TRUE.

1300  Continue

      END
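
The arithmetic carried out by subroutine TMSEP3 can be restated compactly. The following Python sketch is not part of the thesis code; the function names (`select_index`, `squared_error_sum`, `tmsep`) and the data layout are assumptions made only for illustration. It mirrors three steps of the listing: choosing, among the candidate models, the one with the smallest criterion value; accumulating squared differences between the predicted response and the theoretical response (the sum of the true predictors); and dividing the accumulated error by the accumulated values of s minus the number of variables in each selected model.

```python
from typing import Sequence


def select_index(criterion: Sequence[float], candidates: Sequence[int]) -> int:
    """Return the candidate whose criterion value is smallest.

    A strict '<' keeps the first minimum, like the .lt. test in the listing.
    """
    best = candidates[0]
    for i in candidates[1:]:
        if criterion[i] < criterion[best]:
            best = i
    return best


def squared_error_sum(b0: float,
                      betas: Sequence[float],
                      predictors: Sequence[Sequence[float]],
                      x_true: Sequence[Sequence[float]]) -> float:
    """Sum of (y_pred - y_actual)**2 over the observations of one data set.

    y_actual is the sum of the true predictors for the observation;
    y_pred is the intercept plus the fitted coefficients times the
    predictors offered to the selection procedure.
    """
    total = 0.0
    for row_pred, row_true in zip(predictors, x_true):
        y_actual = sum(row_true)
        y_pred = b0 + sum(b * v for b, v in zip(betas, row_pred))
        total += (y_pred - y_actual) ** 2
    return total


def tmsep(sse_per_set: Sequence[float],
          n_obs: int,
          n_vars_per_set: Sequence[int]) -> float:
    """Pooled squared error divided by the pooled (n_obs - number of variables)."""
    return sum(sse_per_set) / sum(n_obs - k for k in n_vars_per_set)
```

In the listing the candidates are the rows 1, 7, 22, 42, 57, and 63 of the RSQUARE output, and the division is done once per design point after all 60 data sets have been processed.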
SAS Program ERROR1_ALL.SAS

option linesize=80;
filename new '01.dat';
data new;
infile new;
input set y x1 x2 x3 x4 e1 ;
proc rsquare data=new mse sp cp;
by set;
model y= x1 x2 x3 e1 ;

filename new '03.dat';
data new;
infile new;
input set y x1 x2 x3 x4 e1 ;
proc rsquare data=new mse sp cp;
by set;
model y= x1 x2 x3 e1 ;

filename new '05.dat';
data new;
infile new;
input set y x1 x2 x3 x4 e1 ;
proc rsquare data=new mse sp cp;
by set;
model y= x1 x2 x3 e1 ;

filename new '07.dat';
data new;
infile new;
input set y x1 x2 x3 x4 e1 ;
proc rsquare data=new mse sp cp;
by set;
model y= x1 x2 x3 e1 ;


filename new '63.dat';
data new;
infile new;
input set y x1 x2 x3 x4 e1 ;
proc rsquare data=new mse sp cp;
by set;
model y= x1 x2 x3 e1 ;
SAS Program ERROR3_ALL.SAS

option linesize=80;
filename new '02.dat';
data new;
infile new;
input set y x1 x2 x3 x4 e1 e2 e3 @;
proc rsquare data=new mse sp cp;
by set;
model y= x1 x2 x3 e1 e2 e3;

filename new '04.dat';
data new;
infile new;
input set y x1 x2 x3 x4 e1 e2 e3 @;
proc rsquare data=new mse sp cp;
by set;
model y= x1 x2 x3 e1 e2 e3;

filename new '06.dat';
data new;
infile new;
input set y x1 x2 x3 x4 e1 e2 e3 @;
proc rsquare data=new mse sp cp;
by set;
model y= x1 x2 x3 e1 e2 e3 ;

filename new '08.dat';
data new;
infile new;
input set y x1 x2 x3 x4 e1 e2 e3 @;
proc rsquare data=new mse sp cp;
by set;
model y= x1 x2 x3 e1 e2 e3 ;


filename new '64.dat';
data new;
infile new;
input set y x1 x2 x3 x4 e1 e2 e3 @;
proc rsquare data=new mse sp cp;
by set;
model y= x1 x2 x3 e1 e2 e3 ;
SAS Program MILLER1BETA.SAS

FILENAME NEW '01.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1;
IF SETNUM ^= 1 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X1 X3 X2 /INCLUDE=3;

FILENAME NEW '01.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1;
IF SETNUM ^= 2 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X1 X3 X2 /INCLUDE=3;

FILENAME NEW '01.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1;
IF SETNUM ^= 3 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X2 /INCLUDE=1;

FILENAME NEW '01.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1;
IF SETNUM ^= 4 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X3 /INCLUDE=1;

FILENAME NEW '01.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1;
IF SETNUM ^= 5 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X1 X3 /INCLUDE=2;

FILENAME NEW '01.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1;
IF SETNUM ^= 6 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X1 X3 E1 /INCLUDE=3;

FILENAME NEW '03.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1;
IF SETNUM ^= 1 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X1 X3 X2 E1 /INCLUDE=4;

FILENAME NEW '03.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1;
IF SETNUM ^= 2 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X1 X3 /INCLUDE=2;

FILENAME NEW '03.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1;
IF SETNUM ^= 3 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X1 X2 X3 /INCLUDE=3;

FILENAME NEW '03.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1;
IF SETNUM ^= 4 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X3 X2 X1 E1 /INCLUDE=4;

FILENAME NEW '03.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1;
IF SETNUM ^= 5 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X3 X1 /INCLUDE=2;

FILENAME NEW '03.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1;
IF SETNUM ^= 6 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X3 X1 X2 /INCLUDE=3;


FILENAME NEW '63.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1;
IF SETNUM ^= 1 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X1 X3 X2 E1 /INCLUDE=4;

FILENAME NEW '63.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1;
IF SETNUM ^= 2 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X1 X2 X3 /INCLUDE=3;

FILENAME NEW '63.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1;
IF SETNUM ^= 3 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X3 X2 X1 E1 /INCLUDE=4;

FILENAME NEW '63.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1;
IF SETNUM ^= 4 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X3 X2 X1 /INCLUDE=3;

FILENAME NEW '63.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1;
IF SETNUM ^= 5 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X2 X1 X3 /INCLUDE=3;

FILENAME NEW '63.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1;
IF SETNUM ^= 60 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X2 X1 X3 /INCLUDE=3;
SAS Program MILLER3BETA.SAS

FILENAME NEW '02.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1 E2 E3;
IF SETNUM ^= 1 THEN DELETE;
INTERCEP = 1;
PROC RSQUARE DATA=NEW NOINT B;
MODEL Y = INTERCEP;

FILENAME NEW '02.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1 E2 E3;
IF SETNUM ^= 2 THEN DELETE;
INTERCEP = 1;
PROC RSQUARE DATA=NEW NOINT B;
MODEL Y = INTERCEP;

FILENAME NEW '02.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1 E2 E3;
IF SETNUM ^= 3 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X3 E2 X2 /INCLUDE=3;

FILENAME NEW '02.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1 E2 E3;
IF SETNUM ^= 4 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X3 X2 X1 /INCLUDE=3;

FILENAME NEW '02.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1 E2 E3;
IF SETNUM ^= 5 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X1 X3 E1 /INCLUDE=3;

FILENAME NEW '04.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1 E2 E3;
IF SETNUM ^= 1 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X2 X3 X1 /INCLUDE=3;

FILENAME NEW '04.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1 E2 E3;
IF SETNUM ^= 2 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X2 X1 X3 /INCLUDE=3;

FILENAME NEW '04.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1 E2 E3;
IF SETNUM ^= 3 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X*? X3 /INCLUDE=2;

FILENAME NEW '04.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1 E2 E3;
IF SETNUM ^= 4 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X3 /INCLUDE=1;

FILENAME NEW '04.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1 E2 E3;
IF SETNUM ^= 5 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X2 X1 X3 /INCLUDE=3;

FILENAME NEW '04.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1 E2 E3;
IF SETNUM ^= 6 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X2 X1 /INCLUDE=2;

. . .

FILENAME NEW '64.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1 E2 E3;
IF SETNUM ^= 1 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X2 X3 X1 /INCLUDE=3;

FILENAME NEW '64.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1 E2 E3;
IF SETNUM ^= 2 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X2 X3 X1 E3 E2 /INCLUDE=5;

FILENAME NEW '64.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1 E2 E3;
IF SETNUM ^= 3 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X1 X2 X3 /INCLUDE=3;

FILENAME NEW '64.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1 E2 E3;
IF SETNUM ^= 4 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X3 X2 /INCLUDE=2;

FILENAME NEW '64.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1 E2 E3;
IF SETNUM ^= 5 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X1 X2 X3 E1 /INCLUDE=4;


FILENAME NEW '64.dat';
DATA NEW;
INFILE NEW;
INPUT SETNUM Y X1 X2 X3 X4 E1 E2 E3;
IF SETNUM ^= 60 THEN DELETE;
PROC RSQUARE DATA=NEW B;
MODEL Y = X2 X1 X3 /INCLUDE=3;
SAS Program PM.SAS

option linesize=80;
filename new 'PM.dat';
data new;
infile new;
input DP ymse ysp ycp YMILLERS CONST
      A B AB C AC BC ABC D AD BD ABD CD ACD BCD ABCD
      E AE BE ABE CE ACE BCE ABCE DE ADE BDE ABDE CDE ACDE BCDE ABCDE
      F AF BF ABF CF ACF BCF ABCF DF ADF BDF ABDF CDF ACDF BCDF ABCDF
      EF AEF BEF ABEF CEF ACEF BCEF ABCEF DEF ADEF BDEF ABDEF CDEF
      ACDEF BCDEF ABCDEF;

PROC PRINT;
TITLE 'Analysis of Performance Measures and Significant Contributing Factors';
ID DP;
VAR ymse ysp ycp YMILLERS CONST
      A B AB C AC BC ABC D AD BD ABD CD ACD BCD ABCD
      E AE BE ABE CE ACE BCE ABCE DE ADE BDE ABDE CDE ACDE BCDE ABCDE
      F AF BF ABF CF ACF BCF ABCF DF ADF BDF ABDF CDF ACDF BCDF ABCDF
      EF AEF BEF ABEF CEF ACEF BCEF ABCEF DEF ADEF BDEF ABDEF CDEF
      ACDEF BCDEF ABCDEF;

proc stepwise;
model ymse = A B AB C AC BC ABC D AD BD ABD CD ACD BCD ABCD
      E AE BE ABE CE ACE BCE ABCE DE ADE BDE ABDE CDE ACDE BCDE ABCDE
      F AF BF ABF CF ACF BCF ABCF DF ADF BDF ABDF CDF ACDF BCDF ABCDF
      EF AEF BEF ABEF CEF ACEF BCEF ABCEF DEF ADEF BDEF ABDEF CDEF
      ACDEF BCDEF ABCDEF
      / stepwise slstay=.01;

proc stepwise;
model ysp = A B AB C AC BC ABC D AD BD ABD CD ACD BCD ABCD
      E AE BE ABE CE ACE BCE ABCE DE ADE BDE ABDE CDE ACDE BCDE ABCDE
      F AF BF ABF CF ACF BCF ABCF DF ADF BDF ABDF CDF ACDF BCDF ABCDF
      EF AEF BEF ABEF CEF ACEF BCEF ABCEF DEF ADEF BDEF ABDEF CDEF
      ACDEF BCDEF ABCDEF
      / stepwise slstay=.01;

proc stepwise;
model ycp = A B AB C AC BC ABC D AD BD ABD CD ACD BCD ABCD
      E AE BE ABE CE ACE BCE ABCE DE ADE BDE ABDE CDE ACDE BCDE ABCDE
      F AF BF ABF CF ACF BCF ABCF DF ADF BDF ABDF CDF ACDF BCDF ABCDF
      EF AEF BEF ABEF CEF ACEF BCEF ABCEF DEF ADEF BDEF ABDEF CDEF
      ACDEF BCDEF ABCDEF
      / stepwise slstay=.01;

proc stepwise;
model YMILLERS = A B AB C AC BC ABC D AD BD ABD CD ACD BCD ABCD
      E AE BE ABE CE ACE BCE ABCE DE ADE BDE ABDE CDE ACDE BCDE ABCDE
      F AF BF ABF CF ACF BCF ABCF DF ADF BDF ABDF CDF ACDF BCDF ABCDF
      EF AEF BEF ABEF CEF ACEF BCEF ABCEF DEF ADEF BDEF ABDEF CDEF
      ACDEF BCDEF ABCDEF
      / stepwise slstay=.01;
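
The 63 regressors named in the MODEL statements above are the main effects and interactions of six two-level factors, listed in standard order. As an illustration only (the function name `effect_names` is hypothetical and this Python sketch is not part of the thesis programs), the same list can be generated rather than typed by hand:

```python
def effect_names(factors: str = "ABCDEF") -> list:
    """Return factorial effect labels in standard order.

    Each new factor contributes itself, followed by its product with
    every effect already in the list (A, B, AB, C, AC, BC, ABC, ...).
    """
    names = []
    for f in factors:
        # The comprehension is built from the list as it stands
        # before the new labels are appended.
        names += [f] + [e + f for e in names]
    return names


if __name__ == "__main__":
    labels = effect_names()
    print(len(labels))
    print(" ".join(labels))
```

Generating the labels this way gives a quick check that a hand-typed variable list contains all 63 effects exactly once.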
SAS Program STEP11_ALL.SAS

option linesize=80 pagesize=57;
filename new '01.dat';
data dataset;
infile new;
input set y x1 x2 x3 x4 e1;
data randset;
set dataset;
r1=RANNOR(0);
r2=RANNOR(0);
r3=RANNOR(0);
r4=RANNOR(0);
proc stepwise data=randset;
by set;
model y= x1 x2 x3 e1 r1 r2 r3 r4 /forward slentry=1;

filename new '03.dat';
data dataset;
infile new;
input set y x1 x2 x3 x4 e1;
data randset;
set dataset;
r1=RANNOR(0);
r2=RANNOR(0);
r3=RANNOR(0);
r4=RANNOR(0);
proc stepwise data=randset;
by set;
model y= x1 x2 x3 e1 r1 r2 r3 r4 /forward slentry=1;


filename new '31.dat';
data dataset;
infile new;
input set y x1 x2 x3 x4 e1;
data randset;
set dataset;
r1=RANNOR(0);
r2=RANNOR(0);
r3=RANNOR(0);
r4=RANNOR(0);
proc stepwise data=randset;
by set;
model y= x1 x2 x3 e1 r1 r2 r3 r4 /forward slentry=1;
SAS Program STEP12_ALL.SAS

option linesize=80 pagesize=57;
filename new '33.dat';
data dataset;
infile new;
input set y x1 x2 x3 x4 e1;
data randset;
set dataset;
r1=RANNOR(0);
r2=RANNOR(0);
r3=RANNOR(0);
r4=RANNOR(0);
proc stepwise data=randset;
by set;
model y= x1 x2 x3 e1 r1 r2 r3 r4 /forward slentry=1;

filename new '35.dat';
data dataset;
infile new;
input set y x1 x2 x3 x4 e1;
data randset;
set dataset;
r1=RANNOR(0);
r2=RANNOR(0);
r3=RANNOR(0);
r4=RANNOR(0);
proc stepwise data=randset;
by set;
model y= x1 x2 x3 e1 r1 r2 r3 r4 /forward slentry=1;

. . .

filename new '63.dat';
data dataset;
infile new;
input set y x1 x2 x3 x4 e1;
data randset;
set dataset;
r1=RANNOR(0);
r2=RANNOR(0);
r3=RANNOR(0);
r4=RANNOR(0);
proc stepwise data=randset;
by set;
model y= x1 x2 x3 e1 r1 r2 r3 r4 /forward slentry=1;
SAS Program STEP31_ALL.SAS

option linesize=80 pagesize=57;
filename new '02.dat';
data dataset;
infile new;
input set y x1 x2 x3 x4 e1 e2 e3;
data randset;
set dataset;
r1=RANNOR(0);
r2=RANNOR(0);
r3=RANNOR(0);
r4=RANNOR(0);
r5=RANNOR(0);
r6=RANNOR(0);
proc stepwise data=randset;
by set;
model y= x1 x2 x3 e1 e2 e3 r1 r2 r3 r4 r5 r6 /forward
slentry=1;

filename new '04.dat';
data dataset;
infile new;
input set y x1 x2 x3 x4 e1 e2 e3;
data randset;
set dataset;
r1=RANNOR(0);
r2=RANNOR(0);
r3=RANNOR(0);
r4=RANNOR(0);
r5=RANNOR(0);
r6=RANNOR(0);
proc stepwise data=randset;
by set;
model y= x1 x2 x3 e1 e2 e3 r1 r2 r3 r4 r5 r6 /forward
slentry=1;


filename new '16.dat';
data dataset;
infile new;
input set y x1 x2 x3 x4 e1 e2 e3;
data randset;
set dataset;
r1=RANNOR(0);
r2=RANNOR(0);
r3=RANNOR(0);
r4=RANNOR(0);
r5=RANNOR(0);
r6=RANNOR(0);
proc stepwise data=randset;
by set;
model y= x1 x2 x3 e1 e2 e3 r1 r2 r3 r4 r5 r6 /forward
slentry=1;
SAS Program STEP32_ALL.SAS

option linesize=80 pagesize=57;
filename new '18.dat';
data dataset;
infile new;
input set y x1 x2 x3 x4 e1 e2 e3;
data randset;
set dataset;
r1=RANNOR(0);
r2=RANNOR(0);
r3=RANNOR(0);
r4=RANNOR(0);
r5=RANNOR(0);
r6=RANNOR(0);
proc stepwise data=randset;
by set;
model y= x1 x2 x3 e1 e2 e3 r1 r2 r3 r4 r5 r6 /forward
slentry=1;

filename new '20.dat';
data dataset;
infile new;
input set y x1 x2 x3 x4 e1 e2 e3;
data randset;
set dataset;
r1=RANNOR(0);
r2=RANNOR(0);
r3=RANNOR(0);
r4=RANNOR(0);
r5=RANNOR(0);
r6=RANNOR(0);
proc stepwise data=randset;
by set;
model y= x1 x2 x3 e1 e2 e3 r1 r2 r3 r4 r5 r6 /forward
slentry=1;


filename new '32.dat';
data dataset;
infile new;
input set y x1 x2 x3 x4 e1 e2 e3;
data randset;
set dataset;
r1=RANNOR(0);
r2=RANNOR(0);
r3=RANNOR(0);
r4=RANNOR(0);
r5=RANNOR(0);
r6=RANNOR(0);
proc stepwise data=randset;
by set;
model y= x1 x2 x3 e1 e2 e3 r1 r2 r3 r4 r5 r6 /forward
slentry=1;
SAS Program STEP33_ALL.SAS

option linesize=80 pagesize=57;
filename new '34.dat';
data dataset;
infile new;
input set y x1 x2 x3 x4 e1 e2 e3;
data randset;
set dataset;
r1=RANNOR(0);
r2=RANNOR(0);
r3=RANNOR(0);
r4=RANNOR(0);
r5=RANNOR(0);
r6=RANNOR(0);
proc stepwise data=randset;
by set;
model y= x1 x2 x3 e1 e2 e3 r1 r2 r3 r4 r5 r6 /forward
slentry=1;

filename new '36.dat';
data dataset;
infile new;
input set y x1 x2 x3 x4 e1 e2 e3;
data randset;
set dataset;
r1=RANNOR(0);
r2=RANNOR(0);
r3=RANNOR(0);
r4=RANNOR(0);
r5=RANNOR(0);
r6=RANNOR(0);
proc stepwise data=randset;
by set;
model y= x1 x2 x3 e1 e2 e3 r1 r2 r3 r4 r5 r6 /forward
slentry=1;


filename new '48.dat';
data dataset;
infile new;
input set y x1 x2 x3 x4 e1 e2 e3;
data randset;
set dataset;
r1=RANNOR(0);
r2=RANNOR(0);
r3=RANNOR(0);
r4=RANNOR(0);
r5=RANNOR(0);
r6=RANNOR(0);
proc stepwise data=randset;
by set;
model y= x1 x2 x3 e1 e2 e3 r1 r2 r3 r4 r5 r6 /forward
slentry=1;


SAS Program STEP34_ALL.SAS

option linesize=80 pagesize=57;
filename new '50.dat';
data dataset;
infile new;
input set y x1 x2 x3 x4 e1 e2 e3;
data randset;
set dataset;
r1=RANNOR(0);
r2=RANNOR(0);
r3=RANNOR(0);
r4=RANNOR(0);
r5=RANNOR(0);
r6=RANNOR(0);
proc stepwise data=randset;
by set;
model y= x1 x2 x3 e1 e2 e3 r1 r2 r3 r4 r5 r6 /forward
slentry=1;

filename new '52.dat';
data dataset;
infile new;
input set y x1 x2 x3 x4 e1 e2 e3;
data randset;
set dataset;
r1=RANNOR(0);
r2=RANNOR(0);
r3=RANNOR(0);
r4=RANNOR(0);
r5=RANNOR(0);
r6=RANNOR(0);
proc stepwise data=randset;
by set;
model y= x1 x2 x3 e1 e2 e3 r1 r2 r3 r4 r5 r6 /forward
slentry=1;


filename new '64.dat';
data dataset;
infile new;
input set y x1 x2 x3 x4 e1 e2 e3;
data randset;
set dataset;
r1=RANNOR(0);
r2=RANNOR(0);
r3=RANNOR(0);
r4=RANNOR(0);
r5=RANNOR(0);
r6=RANNOR(0);
proc stepwise data=randset;
by set;
model y= x1 x2 x3 e1 e2 e3 r1 r2 r3 r4 r5 r6 /forward
slentry=1;


SAS Program TM.SAS

option linesize=80;
filename new 'TM.dat';
data new;
infile new;
input DP ymse ysp ycp YMILLERS CONST
      A B AB C AC BC ABC D AD BD ABD CD ACD BCD ABCD
      E AE BE ABE CE ACE BCE ABCE DE ADE BDE ABDE CDE ACDE BCDE ABCDE
      F AF BF ABF CF ACF BCF ABCF DF ADF BDF ABDF CDF ACDF BCDF ABCDF
      EF AEF BEF ABEF CEF ACEF BCEF ABCEF DEF ADEF BDEF ABDEF CDEF
      ACDEF BCDEF ABCDEF;

PROC PRINT;
TITLE 'Analysis of Performance Measures and Significant Contributing Factors';
ID DP;
VAR ymse ysp ycp YMILLERS CONST
      A B AB C AC BC ABC D AD BD ABD CD ACD BCD ABCD
      E AE BE ABE CE ACE BCE ABCE DE ADE BDE ABDE CDE ACDE BCDE ABCDE
      F AF BF ABF CF ACF BCF ABCF DF ADF BDF ABDF CDF ACDF BCDF ABCDF
      EF AEF BEF ABEF CEF ACEF BCEF ABCEF DEF ADEF BDEF ABDEF CDEF
      ACDEF BCDEF ABCDEF;

proc stepwise;
model ymse = A B AB C AC BC ABC D AD BD ABD CD ACD BCD ABCD
      E AE BE ABE CE ACE BCE ABCE DE ADE BDE ABDE CDE ACDE BCDE ABCDE
      F AF BF ABF CF ACF BCF ABCF DF ADF BDF ABDF CDF ACDF BCDF ABCDF
      EF AEF BEF ABEF CEF ACEF BCEF ABCEF DEF ADEF BDEF ABDEF CDEF
      ACDEF BCDEF ABCDEF
      / stepwise slstay=.01;

proc stepwise;
model ysp = A B AB C AC BC ABC D AD BD ABD CD ACD BCD ABCD
      E AE BE ABE CE ACE BCE ABCE DE ADE BDE ABDE CDE ACDE BCDE ABCDE
      F AF BF ABF CF ACF BCF ABCF DF ADF BDF ABDF CDF ACDF BCDF ABCDF
      EF AEF BEF ABEF CEF ACEF BCEF ABCEF DEF ADEF BDEF ABDEF CDEF
      ACDEF BCDEF ABCDEF
      / stepwise slstay=.01;

proc stepwise;
model ycp = A B AB C AC BC ABC D AD BD ABD CD ACD BCD ABCD
      E AE BE ABE CE ACE BCE ABCE DE ADE BDE ABDE CDE ACDE BCDE ABCDE
      F AF BF ABF CF ACF BCF ABCF DF ADF BDF ABDF CDF ACDF BCDF ABCDF
      EF AEF BEF ABEF CEF ACEF BCEF ABCEF DEF ADEF BDEF ABDEF CDEF
      ACDEF BCDEF ABCDEF
      / stepwise slstay=.01;

proc stepwise;
model YMILLERS = A B AB C AC BC ABC D AD BD ABD CD ACD BCD ABCD
      E AE BE ABE CE ACE BCE ABCE DE ADE BDE ABDE CDE ACDE BCDE ABCDE
      F AF BF ABF CF ACF BCF ABCF DF ADF BDF ABDF CDF ACDF BCDF ABCDF
      EF AEF BEF ABEF CEF ACEF BCEF ABCEF DEF ADEF BDEF ABDEF CDEF
      ACDEF BCDEF ABCDEF
      / stepwise slstay=.01;
SAS Program TMSEP1_ALL.SAS

filename new '01.dat';
data new;
infile new;
input set y x1 x2 x3 x4 e1 ;
proc rsquare data=new mse sp cp b;
by set;
model y= x1 x2 x3 e1 ;

filename new '03.dat';
data new;
infile new;
input set y x1 x2 x3 x4 e1 ;
proc rsquare data=new mse sp cp b;
by set;
model y= x1 x2 x3 e1 ;

filename new '05.dat';
data new;
infile new;
input set y x1 x2 x3 x4 e1 ;
proc rsquare data=new mse sp cp b;
by set;
model y= x1 x2 x3 e1 ;

filename new '07.dat';
data new;
infile new;
input set y x1 x2 x3 x4 e1 ;
proc rsquare data=new mse sp cp b;
by set;
model y= x1 x2 x3 e1 ;

filename new '09.dat';
data new;
infile new;
input set y x1 x2 x3 x4 e1 ;
proc rsquare data=new mse sp cp b;
by set;
model y= x1 x2 x3 e1 ;


filename new '63.dat';
data new;
infile new;
input set y x1 x2 x3 x4 e1 ;
proc rsquare data=new mse sp cp b;
by set;
model y= x1 x2 x3 e1 ;
SAS Program TMSEP3_ALL.SAS

filename new '02.dat';
data new;
infile new;
input set y x1 x2 x3 x4 e1 e2 e3 @;
proc rsquare data=new mse sp cp b;
by set;
model y= x1 x2 x3 e1 e2 e3;

filename new '04.dat';
data new;
infile new;
input set y x1 x2 x3 x4 e1 e2 e3 @;
proc rsquare data=new mse sp cp b;
by set;
model y= x1 x2 x3 e1 e2 e3;

filename new '06.dat';
data new;
infile new;
input set y x1 x2 x3 x4 e1 e2 e3 @;
proc rsquare data=new mse sp cp b;
by set;
model y= x1 x2 x3 e1 e2 e3 ;

filename new '08.dat';
data new;
infile new;
input set y x1 x2 x3 x4 e1 e2 e3 @;
proc rsquare data=new mse sp cp b;
by set;
model y= x1 x2 x3 e1 e2 e3 ;

filename new '10.dat';
data new;
infile new;
input set y x1 x2 x3 x4 e1 e2 e3 @;
proc rsquare data=new mse sp cp b;
by set;
model y= x1 x2 x3 e1 e2 e3 ;


filename new '64.dat';
data new;
infile new;
input set y x1 x2 x3 x4 e1 e2 e3 @;
proc rsquare data=new mse sp cp b;
by set;
model y= x1 x2 x3 e1 e2 e3 ;
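
The RSQUARE programs above repeat one block per data file and the listings show only the first few files and the last. A repetitive program of this kind can be written by a short script. The Python sketch below is illustrative only and is not how the thesis programs were produced; the function names are hypothetical, it assumes the omitted blocks follow the same pattern as the printed ones (even-numbered files 02 through 64 carrying three extraneous variables), and it leaves out the trailing `@` shown on the INPUT statements.

```python
def rsquare_block(file_no: int, n_extra: int, want_betas: bool = True) -> str:
    """Build one FILENAME / DATA / PROC RSQUARE block for a data file.

    file_no    -- number of the data file, written as two digits
    n_extra    -- count of extraneous variables e1..e<n_extra>
    want_betas -- add the B option so coefficients are printed
    """
    extras = " ".join(f"e{i}" for i in range(1, n_extra + 1))
    opts = "mse sp cp b" if want_betas else "mse sp cp"
    lines = [
        f"filename new '{file_no:02d}.dat';",
        "data new;",
        "infile new;",
        f"input set y x1 x2 x3 x4 {extras};",
        f"proc rsquare data=new {opts};",
        "by set;",
        f"model y= x1 x2 x3 {extras};",
    ]
    return "\n".join(lines)


def tmsep3_program() -> str:
    """Join the blocks for the even-numbered files 02, 04, ..., 64."""
    return "\n\n".join(rsquare_block(n, 3) for n in range(2, 65, 2))


if __name__ == "__main__":
    print(tmsep3_program())
```

Calling `rsquare_block(n, 1)` for the odd-numbered files would give the corresponding one-variable blocks under the same assumption.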
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Appendix  K;  Calculated  Performance  Measure  Values  for  PM 


DP   MSE_pm      Sp_pm       Cp_pm       MILLERS_pm
 1   0.8483146   0.9230769   0.9121622   0.8721805
 2   0.6313559   0.7272727   0.7127072   0.7578948
 3   0.8560000   0.9203821   0.9102167   0.9187500
 4   0.6569038   0.7414248   0.7315789   0.8616352
 5   0.8682311   0.9304348   0.9173947   0.9230769
 6   0.6528354   0.7266436   0.7217391   0.8141593
 7   0.8748318   0.9348172   0.9255814   0.9259259
 8   0.6569343   0.7335058   0.7278646   0.8471338
 9   0.8721560   0.9282051   0.9188514   0.8688524
10   0.6557515   0.7311715   0.7254488   0.7983153
11   0.8755596   0.9261186   0.9178499   0.9322034
12   0.6597511   0.7431272   0.7348877   0.8418079
13   0.8723077   0.9229391   0.9143357   0.8769231
14   0.6631016   0.7472284   0.7362963   0.8174603
15   0.8725817   0.9200000   0.9112782   0.8994414
16   0.6633970   0.7509628   0.7395766   0.8545455
17   0.8766962   0.9201878   0.9125575   0.9352941
18   0.6724851   0.7601580   0.7501410   0.8983051
19   0.8786920   0.9198337   0.9120046   0.9179487
20   0.6803313   0.7726582   0.7640507   0.8800000
21   0.8789900   0.9225013   0.9134766   0.9076087
22   0.6854778   0.7806333   0.7723727   0.8518519
23   0.8792879   0.9256921   0.9169847   0.9322917
24   0.6923876   0.7895842   0.7820244   0.8620690
25   0.8801917   0.9274911   0.9190726   0.9470588
26   0.6963887   0.7929838   0.7863479   0.8444445
27   0.8800296   0.9282787   0.9197581   0.9270833
28   0.6977226   0.7966524   0.7905237   0.8421053
29   0.8809360   0.9288433   0.9205695   0.9180328
30   0.7002484   0.7979497   0.7921906   0.8410257
31   0.8810290   0.9304224   0.9220280   0.9090909
32   0.7022486   0.8001853   0.7948084   0.8900000
33   0.8808873   0.9289100   0.9209040   0.9345794
34   0.7007086   0.7967742   0.7916055   0.7619048
35   0.8794798   0.9267033   0.9188332   0.9057971
36   0.6968085   0.7930265   0.7888703   0.8120806
37   0.8795711   0.9254237   0.9174978   0.8983051
38   0.6940168   0.7899920   0.7851990   0.7377049
39   0.8784777   0.9255319   0.9174870   0.9071429
40   0.6909701   0.7874936   0.7823755   0.8214286
41   0.8779142   0.9241438   0.9163203   0.8938053
42   0.6885246   0.7851059   0.7803864   0.7614679
43   0.8772098   0.9234708   0.9151068   0.9127907
44   0.6867860   0.7835648   0.7789670   0.8654971
45   0.8757166   0.9216650   0.9139459   0.8770492
46   0.6866655   0.7838199   0.7796952   0.7777778
Calculated Performance Measure Values for PM (continued)


DP   MSE_pm      Sp_pm       Cp_pm       MILLERS_pm
47   0.8764809   0.9221646   0.9141683   0.9204546
48   0.6851399   0.7825346   0.7784675   0.8383234
49   0.8771561   0.9231497   0.9154701   0.9268293
50   0.6869755   0.7837176   0.7800328   0.8170732
51   0.8781572   0.9236174   0.9162409   0.9125683
52   0.6898949   0.7878609   0.7846002   0.9053254
53   0.8803269   0.9259421   0.9186071   0.9695122
54   0.6912658   0.7896428   0.7863248   0.8412699
55   0.8800600   0.9260113   0.5189467   0.9349113
56   0.6936138   0.7914847   0.7882611   0.8870968
57   0.8798051   0.9265981   0.9194219   0.9259259
58   0.6940785   0.7930974   0.7898628   0.8541667
59   0.8806620   0.9273645   0.9204350   0.9421053
60   0.6952754   0.7946640   0.7912852   0.9179487
61   0.8807000   0.9280630   0.9213628   0.9301075
62   0.6971050   0.7972907   0.7940019   0.8750000
63   0.8820463   0.9288945   0.9222420   0.9230769
64   0.6984938   0.7990220   0.7955985   0.8725491
Appendix L: Calculated Performance Measure Value for TMSEP


DP       MSE_tm        Sp_tm        Cp_tm   MILLERS_tm
 1    0.7650620    0.8087910    1.0234350    1.1609520
 2    0.6637320    0.7893480    1.0953370    1.6926820
 3    0.1487610    0.1595710    0.2381690    0.1970290
 4    0.1581330    0.1644490    0.2421660    0.2214360
 5    0.8993290    0.9814150    1.2547690    1.3781180
 6    0.6908280    0.7437650    1.2031360    1.4195740
 7    0.1405690    0.1471160    0.2459100    0.1979690
 8    0.1488380    0.1552430    0.2376890    0.2127430
 9   77.9380340   83.1626890  108.2751310  132.1354680
10   56.1681590   61.7997280   98.8873060  135.7216490
11   10.2727160   10.7429670   20.8164410   12.9381190
12    7.9198800    9.1730830   20.6267190   15.4755950
13   75.7267150   82.3265460  113.1051330  131.7463990
14   53.3580280   61.4832120  107.3695450  141.4950870
15    9.5692500    9.8932590   19.2865920   12.2740480
16    8.0496980    8.8758070   18.0312250   16.5588720
17    0.9292540    0.9358480    1.4927760    1.1517850
18    0.8593490    0.8651500    1.4510400    1.0863170
19    0.1407100    0.1423830    0.2824660    0.1426570
20    0.1367260    0.1338730    0.2878020    0.1418870
21    0.9471070    0.9643650    1.4791970    1.0234790
22    0.8688670    0.8869560    1.5071520    1.0349690
23    0.1427590    0.1425130    0.2991620    0.1468790
24    0.1363060    0.1346770    0.2990880    0.1383570
25   89.9664610   90.8934100  147.2874150  101.9961320
26   84.1691510   85.6997380  141.4162900  102.0691220
27   12.1973890   12.3373200   26.9606130   12.7069930
28   10.9168790   11.2033390   26.4688630   12.9651400
29   91.0713040   91.3485720  142.1083830  101.7852550
30   84.5563960   85.3022690  144.7259980   98.6170500
31   12.4259120   12.5170760   28.0536590   12.4964480
32   11.2453370   11.4638000   27.5978010   12.0174870
33    0.9804800    1.1347960    1.3585030    1.5437970
34    0.8325220    0.3686400    1.1092190    1.4102540
35    0.2638300    0.2741530    0.2970570    0.3223040
36    0.3523780    0.3504670    0.3609280    0.3678640
37    0.9159700    0.9888550    1.3031770    1.4501740
38    0.9021820    0.9639130    1.1355340    1.3516600
39    0.2787260    0.2903530    0.3201970    0.3200920
40    0.3576060    0.3316960    0.3655560    0.3798990
41   68.9201740   74.7250060  108.4361880  124.8693010
42   57.7932780   65.6259770   97.4607010  132.9111790
43    9.4603400    9.7781960   18.6886480   13.1972780
44    8.4567630    9.0632910   20.8584960   15.7659790
45   79.2766190   83.3591160  107.4557270  119.1930390
Calculated Performance Measure Value for TMSEP (continued)


DP       MSE_tm        Sp_tm        Cp_tm   MILLERS_tm
46   62.0207100   70.4859700  102.2720180  134.2141570
47   10.2950660   10.8312980   19.9675880   12.3031880
48    7.6850630    9.1260600   19.7178250   17.8853320
49    0.9873370    0.9887390    1.5157680    1.2343050
50    0.9494320    0.9554130    1.4946450    1.3705840
51    0.1949640    0.1986370    0.3434490    0.2177160
52    0.2108450    0.2073290    0.3380340    0.2504210
53    1.0238520    1.0291120    1.5982300    1.1669810
54    0.9318340    0.9543170    1.5202000    1.0758220
55    0.1930650    0.1973760    0.3198640    0.2236930
56    0.2103010    0.2086840    0.3424410    0.2333300
57   92.1139220   92.9549100  148.0592800  119.4541400
58   83.7433850   85.6070180  142.4074100   96.1285930
59   12.0630550   12.1251390   26.6133730   12.2637760
60   10.9248680   11.0991730   26.7097450   11.5297640
61   93.2993090   94.0328060  151.9314420   96.3185420
62   85.7757420   87.8176350  140.6162720  106.2438430
63   11.7405510   11.8084090   26.8109000   11.8390310
64   11.1539340   11.2962640   25.5540160   11.6543080